Monday, October 28, 2019

How Effective Is Artificial Intelligence In Healthcare?

Artificial intelligence (AI) and predictive modeling based on Big Data will be the key buzzwords of the coming decade. From banking to retail and from education to healthcare, data-intensive sectors are getting ready to be served by AI-driven computers programmed to make decisions with minimal human intervention. While interacting with “May-I-Help-You” chatbots is common on websites today, how comfortable are we sharing our medical history with a chatbot on a hospital website? The use of artificial intelligence in the healthcare sector is still in its infancy in India. Let’s take a look at how AI is changing healthcare globally today and how it is likely to do so in the future.

Can artificial intelligence change the face of healthcare globally?

Data Management

The first step in healthcare involves compiling the patient’s health history and medical records. Digital automation carries this out efficiently: every parameter affecting the patient’s health, along with the dose-response details of all previous medication, can be retrieved and analyzed faster and more consistently. While consulting an individual patient with a certain set of symptoms, the doctor can pull up hundreds of other cases with similar symptoms from the database and discuss how they responded to particular medications. This enables an informed discussion between both sides, brings medication down to a personal-care level, and builds confidence.

Performing Repetitive Analysis

Automated analysis of routine X-rays, ECGs, and CT scans will save a huge amount of hospital work time; human intervention will be required only to supervise the most critical cases. This will help hospitals manage resources better and deploy expert staff to saving lives in Intensive Care Units (ICUs) and Intensive Therapy Units (ITUs).

Diagnosis And Predictive Consultation

AI can chart out a future course of action for present ailments and help outpatients and medical practitioners with predictive analysis. This can save repeated hospital visits for recurring ailments and provide medical care when a doctor is unavailable. Apps like Babylon in the U.K. offer medical consultation based on the patient’s personal medical history, ranked against a vast database of illnesses and common medical knowledge.

Digital Nurses

In 2016, Boston Children’s Hospital developed an app for Amazon’s Alexa that gives basic health information to parents of ill children. It also answers questions about medication and suggests a doctor’s visit after scanning symptoms. Molly, a digital nurse developed by the startup Sense.ly, monitors patient condition and handles follow-ups between doctor visits for patients with critical illnesses. In many cases, this has been found to reduce hospitalization time for patients.

Creation of Drugs

Developing drugs through clinical trials takes over a decade and involves huge costs. AI can aid this process by scanning existing medicines, including their differences in composition and effectiveness, and suggesting redesigned chemical formulations and combinations for tackling sudden medical exigencies or deadly outbreaks caused by new strains of viruses. This approach proved useful during the recent fight against Ebola, where AI-suggested medication was found to be effective.

Health Management Apps

Wearable health trackers like Fitbit, Apple Watch, and Garmin devices monitor real-time heart rate and activity, chart out activity routines for the day, and send warning messages when certain parameters deteriorate, based on the habits and needs of the wearer.

IIoT Automation: When’s the Right Time to Invest in Automation?



6 Steps to Bring Clarity to Industrial IoT (IIoT) Automation

The allure of process automation is growing as it becomes more attainable and affordable. In the past, owning a sleek robot to assemble widgets at high speeds was as likely as having a Lamborghini sitting in your garage. But, today, automated solutions are as numerous and dependable as family sedans. Automation can provide a practical, robust and long-term solution. But, how do you know when and where to upgrade?

A good process for determining this will include the following six steps.

Begin Without Limits

1. Start with a Pie-in-the-Sky List

Make a list of ways you would automate if you had unlimited funds. The answer “everything” isn’t detailed enough. Really consider why you would automate, and what benefits the automation would bring. The “why” is especially important to the specification. Sometimes automation is viewed as the “easy” fix to problems that can and should be addressed in other ways.
Think of the process like you would when specifying your dream car. Your wish list would be long and detailed. You would pick out everything from the custom intake to the finish on the dash. Treat your automation project the same way.

2. Find the “How”

Systems integrators and automation engineers are experts in what can be accomplished reliably, using today’s technology and control systems. They can help guide you through the next step, which involves determining the actual monetary costs and benefits of automated solutions. You know what you need to automate; they know what the automation process requires.

Justify the Project; Quantify Benefits

3. Prove It

The third step is to take the most important pieces of your dream IIoT automation plan and prove or disprove the long-term financial benefits. Make sure that you can justify the costs of each automation need. For example, does the speed and production volume of your packaging line justify an automated packing station? Would manual packing operations be more financially suitable and better matched to your production needs?
Another consideration is the complexity of what you’re trying to automate. A large percentage of the automation costs is based on the number of tasks the system must perform. Conveying product and cases, for example, is one task that’s easy and cheap to automate. Assembling and printing a multi-part package where the machine must measure, rotate, count, index and assemble is going to be complex (and much more expensive to automate).
Machines aren’t human beings; they must be designed and programmed for each task. When raw material variability, multi-axis coordination and precise placement are involved, the machine must handle these tasks consistently with a high degree of reliability. How many moving parts are involved? How quickly must the task be performed? How many different packages or stock-keeping unit (SKU) numbers is the machine required to run? The answers to these questions can drive up the cost of automation.
When proving the value, use actual numbers. Try to assign a monetary amount to how much you can save versus how much you will spend to automate. What’s my return on investment (ROI) and does it meet my company’s requirements? A big cost factor will be the speed of the line or process—does the increase in the output (or products you can sell) justify the automation expense?
Factor in as many differences as you can determine: the cost of interruptions to existing operations for automated upgrades, manual labor expenses, differences in upkeep costs, spare part costs, operator/maintenance training costs and scrap costs, for example. Use a long-term view when trying to measure these costs, and determine your automation “break-even” point. How long will your automated line have to run to pay for the automation upgrade?
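The break-even question above comes down to simple arithmetic. A minimal sketch in Python, using made-up illustrative figures rather than real quotes:

```python
# Hypothetical break-even estimate for an automation upgrade.
# All dollar figures are illustrative assumptions, not vendor numbers.

def break_even_months(upfront_cost, monthly_savings):
    """Months of operation needed to pay back the upgrade."""
    if monthly_savings <= 0:
        raise ValueError("No payback if the upgrade saves nothing per month")
    return upfront_cost / monthly_savings

# Assumed figures: $120,000 installed cost; labor + scrap savings of
# $4,500/month, minus $500/month in extra maintenance and spares.
cost = 120_000
net_monthly = 4_500 - 500
print(f"Break-even after {break_even_months(cost, net_monthly):.1f} months")
```

In practice the same calculation would fold in every cost difference listed above (interruptions, training, spares) into the net monthly figure.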

Practical Example

Always look for the greatest impact at the lowest cost when prioritizing when and where to automate. Determine the functions to automate, how difficult automation would be, and the costs versus the benefits. For a packaging line, you might chart each function against its automation cost and complexity, then prioritize tasks from there. A few observations you might make:
  • There are several low cost-to-automate functions that could quickly improve line speed. Case erection and case sealing are good examples. These options might be good first stage improvements.
  • Consider where in the line these opportunities exist. Palletizing is an end-of-the-line function. It won’t have much effect on product output on a given day. Do you need to automate it?
  • Some of the more complicated tasks might not be worth automating yet. Bundle packing is a good example. Bundle packing involves bringing multiple products together, orienting the product, and shrinking a sleeve around the entire arrangement. This is going to be a complicated process to automate. Conversely, do you have the space required for a hand-pack station? Are workers able to hand-pack at a pace that meets your output requirements? Bundle packing is a mid-stream operation and could become the rate-limiter for the line.
  • Some things are worth automating because of the increase in line-speed they afford. For example, filling or case packing can greatly increase speed. Does the increase in the number of products justify the expense of automating these pieces?

Essential Elements

4. Determine Your Backbone

For anyone looking at an automated solution, there’s an essential base level of controls that must exist to make the rest possible. What is the essential base of automation for your project? Determine the basic automated structures that must exist and consider those your backbone. An experienced integrator will determine these needs and provide room for future growth at a reasonable cost.
Many automation projects move forward in stages. The backbone automation is the first stage—and maybe the only automation work you do in year one. You can spread the costs of automation by establishing this base and adding on as you move forward. Assembly could stay manual while the controls backbone is installed and the conveying is automated. Develop a timeline for the major pieces, planning future upgrades.

5. Safety and Compliance Matters

Automation is a great way to improve overall safety and compliance and generally brings a greater measure of reliability than human beings. Lockout/Tagout (LOTO) systems, machine guards, light curtains and other safety measures can be easily added. Safety practices and compliance measures must be a part of your plan.
Depending on the safety measures you will be using, the financial implications of safety systems vary. This is an area where an automation consultant can help. Automation constantly changes, and your process might change your required safety standards.
Enzyme use in consumer products manufacturing is an example of how safety can impact an automation project. Enzymes used in manufacturing can become a problem when used in large amounts. Increasing the volume of your production with automation might mean that you must account for these elevated levels of enzymes. Is an HVAC system necessary? Do workers need to wear PPE? Are room modifications required?
Will automation make your process inherently more dangerous? Are there points on the line where automation isn’t safe? Strategically choosing to require an operator to manually shut off a valve is an example. This feature ensures that the line operator must physically check the production line in specified intervals. This is an additional check for safety.
Sometimes compliance is itself a reason to automate. Your industry may suddenly require companies to ensure the absence of metal or other foreign objects in products. The cost of compliance is an operational cost, as you must comply to stay in business. In many cases, automation provides superior reliability and speed over manual solutions.

Think You’re Finished?

6. Reduce, Review and Revise

Go back over your entire plan. Automation isn’t a stand-alone element. Usually, there are civil upgrades, logistical requirements, ergonomic considerations, equipment purchases and labor costs associated with it.
Are there areas outside of IIoT Automation where costs can be reduced? For example, are stainless tanks required, or will plastic tanks meet your needs? Do your civil and mechanical upgrades match your automation stages in terms of scope and timing? Consider the whole project, not just the automation part. Reducing your civil and mechanical scope may allow more automation up front.
Going through this entire thought process will help you develop a comprehensive automation strategy that considers costs, timeline and benefits. Much like car shopping, first, separate the wants from the needs and take a practical look at what’s possible. Once you decide to automate, it is a long-term commitment that will positively impact your business for years to come. Time invested up front will result in a much better automation solution for you and peace-of-mind moving forward.

Mathematics for AI: All the important math topics for AI and ML you need to know.



“The key to artificial intelligence has always been the representation.” — Jeff Hawkins


As we know, artificial intelligence has gained importance in the last decade, with a great deal depending on the development and integration of AI into our daily lives. The progress AI has already made is astounding: self-driving cars, medical diagnosis, and even beating humans at strategy games like Go and chess.
The future of AI is extremely promising, and the day we have our own robotic companions may not be far off. This has pushed many developers to start writing code for AI and ML programs. However, learning to write algorithms for AI and ML isn’t easy and requires extensive programming and mathematical knowledge.
Mathematics plays an important role, as it builds the foundation for programming in these two streams. That is exactly what this article covers: I designed it to help you master the mathematical foundation required for writing programs and algorithms for AI and ML.
My recommended path for learning mathematics for AI goes like this:

Linear Algebra:

Linear algebra is used in machine learning to describe the parameters and structure of different machine learning algorithms. This makes linear algebra a necessity for understanding how neural networks are put together and how they operate.
It covers topics such as:
  • Scalars, Vectors, Matrices, Tensors
  • Matrix Norms
  • Special Matrices and Vectors
  • Eigenvalues and Eigenvectors
  • Principal component analysis
  • Singular value decomposition
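As a small illustration of the last two bullets, principal component analysis can be computed from a singular value decomposition of centered data. A minimal sketch with NumPy, using a tiny made-up dataset:

```python
import numpy as np

# Toy PCA via singular value decomposition on a small 2-D dataset
# (the numbers are illustrative only).
X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9],
              [1.9, 2.2], [3.1, 3.0], [2.3, 2.7]])
Xc = X - X.mean(axis=0)            # center each feature at zero

# SVD of the centered data: rows of Vt are the principal directions
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

explained = S**2 / np.sum(S**2)    # fraction of variance per component
X_1d = Xc @ Vt.T[:, :1]            # project onto the first principal component
print("variance explained:", explained)
```

Because the two features here are strongly correlated, the first component captures nearly all of the variance, which is exactly the dimensionality-reduction effect PCA is used for.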

Calculus:

This supplements the learning part of machine learning. It is what is used to learn from examples, update the parameters of different models, and improve performance.
It covers topics such as:
  • Derivatives (scalar derivative, chain rule), partial and directional derivatives
  • Integrals
  • Gradients
  • Differential Operators
  • Convex Optimization
  • Gradient algorithms: local/global maxima and minima, SGD, NAG, MAG, Adam
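A minimal sketch of the core idea behind the last bullet: plain gradient descent on a one-variable convex function. The learning rate, starting point, and iteration count are arbitrary illustrative choices:

```python
# Minimal gradient descent on f(x) = (x - 3)^2, whose minimum is at x = 3.

def grad(x):
    return 2 * (x - 3)        # derivative of (x - 3)^2

x = 0.0                       # starting point
lr = 0.1                      # learning rate
for _ in range(100):
    x -= lr * grad(x)         # step against the gradient

print(round(x, 4))            # converges toward 3
```

SGD, NAG, and Adam are refinements of this same loop: they change how the step is computed (noisy mini-batch gradients, momentum, adaptive per-parameter rates), not the basic descent idea.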

Probability Theory:

Probability theory is used to make assumptions about the underlying data when designing deep learning or AI algorithms. It is important to understand the key probability distributions.
It covers topics such as:
  • Elements of Probability
  • Random Variables
  • Distributions (Binomial, Bernoulli, Poisson, Exponential, Gaussian)
  • Variance and Expectation
  • Bayes’ Theorem, MAP, MLE
  • Special Random Variables
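Bayes’ theorem, listed above, can be illustrated with the classic diagnostic-test calculation. All probabilities below are made-up illustrative numbers:

```python
# Bayes' theorem on a diagnostic-test example (illustrative figures):
# P(disease) = 1%, sensitivity P(+|disease) = 99%, false-positive rate = 5%.

p_d = 0.01
p_pos_given_d = 0.99
p_pos_given_not_d = 0.05

# Law of total probability: P(+) over both disease and no-disease cases
p_pos = p_pos_given_d * p_d + p_pos_given_not_d * (1 - p_d)

# Bayes' theorem: P(disease | +) = P(+ | disease) * P(disease) / P(+)
p_d_given_pos = p_pos_given_d * p_d / p_pos

print(f"P(disease | positive test) = {p_d_given_pos:.3f}")
```

The result (about 0.17) shows why understanding priors matters: even a very accurate test gives mostly false positives when the condition is rare.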

Others

  • Markov Chain
  • Information Theory

Where you can learn these topics:

  • Youtube Videos
  • Textbooks
  • Online Course
  • Google Search
After covering the above topics, you will have not only the knowledge to build your own algorithms, but also the confidence to actually start putting them to use in your next projects and to learn exactly how these concepts apply in real life.
Copyright https://medium.com

What Is Machine Learning?


Machine learning is one of the fastest-growing technological fields, but despite how often the words “machine learning” are tossed around, it can be difficult to understand precisely what machine learning is.
Machine learning doesn’t refer to just one thing; it’s an umbrella term that can be applied to many different concepts and techniques. Understanding machine learning means being familiar with different forms of model analysis, variables, and algorithms. Let’s take a closer look at machine learning to better understand what it encompasses.

What Is Machine Learning?

While the term machine learning can be applied to many different things, in general, the term refers to enabling a computer to carry out tasks without receiving explicit line-by-line instructions to do so. A machine learning specialist doesn’t have to write out all the steps necessary to solve the problem because the computer is capable of “learning” by analyzing patterns within the data and generalizing these patterns to new data.
Machine learning systems have three basic parts:
  • Inputs
  • Algorithms
  • Outputs
The inputs are the data that is fed into the machine learning system, and the input data can be divided into labels and features. Features are the relevant variables, the variables that will be analyzed to learn patterns and draw conclusions. Meanwhile, the labels are classes/descriptions given to the individual instances of the data.
Features and labels can be used in two different types of machine learning problems: supervised learning and unsupervised learning.

Unsupervised vs. Supervised Learning

In supervised learning, the input data is accompanied by a ground truth. Supervised learning problems have the correct output values as part of the dataset, so the expected classes are known in advance. This makes it possible for the data scientist to check the performance of the algorithm by testing the data on a test dataset and seeing what percentage of items were correctly classified.
In contrast, unsupervised learning problems do not have ground truth labels attached to them. A machine learning algorithm trained to carry out unsupervised learning tasks must be able to infer the relevant patterns in the data for itself.
Supervised learning algorithms are typically used for classification problems, where one has a large dataset filled with instances that must be sorted into one of many different classes. Another type of supervised learning is a regression task, where the value output by the algorithm is continuous in nature instead of categorical.
Meanwhile, unsupervised learning algorithms are used for tasks like density estimation, clustering, and representation learning. These three tasks require the machine learning model to infer the structure of the data; no predefined classes are given to the model.
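Checking “what percentage of items were correctly classified”, as described for supervised learning above, is a short computation. A minimal sketch with made-up labels and predictions:

```python
# Scoring supervised-learning predictions against ground-truth labels.
# The labels and predictions are made-up illustrative values.

def accuracy(y_true, y_pred):
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

y_true = ["cat", "dog", "dog", "cat", "dog"]   # ground truth from the dataset
y_pred = ["cat", "dog", "cat", "cat", "dog"]   # what the model predicted
print(f"{accuracy(y_true, y_pred):.0%} correctly classified")
```

No such score is possible in the unsupervised setting, since there is no ground truth to compare against.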
Let’s take a brief look at some of the most common algorithms used in both unsupervised learning and supervised learning.
Supervised Learning
Common supervised learning algorithms include:
  • Naive Bayes
  • Support Vector Machines
  • Logistic Regression
  • Random Forests
  • Artificial Neural Networks
Support Vector Machines are algorithms that divide up a dataset into different classes. Data points are grouped into clusters by drawing lines that separate the classes from one another. Points found on one side of the line will belong to one class, while the points on the other side of the line are a different class. Support Vector Machines aim to maximize the distance between the line and the points found on either side of the line, and the greater the distance the more confident the classifier is that the point belongs to one class and not another class.
Logistic Regression is an algorithm used in binary classification tasks, when data points need to be classified as belonging to one of two classes. Logistic Regression works by assigning each data point a value between 0 and 1: if the predicted value is below 0.5, the point is classified as 0, while if it is 0.5 or above it is classified as 1.
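The thresholding just described can be sketched directly. The weights and bias below are illustrative hand-picked values, not parameters learned from data:

```python
import math

# Logistic regression's decision rule: squash a weighted sum through the
# sigmoid function, then threshold the result at 0.5.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(features, weights, bias):
    z = sum(f * w for f, w in zip(features, weights)) + bias
    return 1 if sigmoid(z) >= 0.5 else 0

# sigmoid(2*0.8 + 1*(-0.4) - 1.0) = sigmoid(0.2) ≈ 0.55, so class 1
print(predict([2.0, 1.0], [0.8, -0.4], -1.0))
```

Training logistic regression means finding the weights and bias that make these thresholded outputs match the labels as often as possible.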
Decision Tree algorithms operate by dividing datasets up into smaller and smaller fragments. The exact criteria used to divide the data is up to the machine learning engineer, but the goal is to ultimately divide the data up into single data points, which will then be classified using a key.
A Random Forest algorithm is essentially many single Decision Tree classifiers linked together into a more powerful classifier.
The Naive Bayes Classifier calculates the probability that a given data point has occurred based on the probability of a prior event occurring. It is based on Bayes Theorem and it places the data points into classes based on their calculated probability. When implementing a Naive Bayes classifier, it is assumed that all the predictors have the same influence on the class outcome.
An Artificial Neural Network, or multilayer perceptron, is a machine learning algorithm inspired by the structure and function of the human brain. Artificial neural networks get their name from the fact that they are made of many nodes/neurons linked together. Each neuron manipulates the data with a mathematical function. Artificial neural networks have input layers, hidden layers, and output layers.
The hidden layer of the neural network is where the data is actually interpreted and analyzed for patterns. In other words, it is where the algorithm learns. More neurons joined together make more complex networks capable of learning more complex patterns.
Unsupervised Learning
Unsupervised Learning algorithms include:
  • K-means clustering
  • Autoencoders
  • Principal Component Analysis
K-means clustering is an unsupervised clustering technique that separates data points into clusters, or groups, based on their features. K-means analyzes the features of the data points and finds patterns that make the data points within a given cluster more similar to each other than they are to the data points in other clusters. This is accomplished by placing candidate cluster centers, or centroids, among the data and repositioning each centroid until a position is found that minimizes the distance between the centroid and the points belonging to that centroid’s cluster. The researcher specifies the desired number of clusters.
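The assign-then-reposition loop just described can be sketched in a few lines of plain Python. The 1-D data and starting centroids are illustrative:

```python
# K-means on tiny 1-D data: assign each point to the nearest centroid,
# then move each centroid to the mean of its assigned points, and repeat.
# Data and starting centroids are illustrative values.

points = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0]
centroids = [2.0, 9.0]          # k = 2 starting guesses

for _ in range(10):             # repeat until centroids stop moving
    clusters = {i: [] for i in range(len(centroids))}
    for p in points:
        nearest = min(range(len(centroids)),
                      key=lambda i: abs(p - centroids[i]))
        clusters[nearest].append(p)
    centroids = [sum(c) / len(c) for c in clusters.values()]

print(centroids)                # settles at the two group means
```

Real implementations work in many dimensions, use Euclidean distance, and guard against clusters that end up empty, but the loop is the same.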
Principal Component Analysis is a technique that reduces a large number of features/variables down to a smaller feature space with fewer features. The “principal components” of the data are selected for preservation, while the other features are squeezed into a smaller representation. The relationships between the original data points are preserved, but since the representation is simpler, the data is easier to quantify and describe.
Autoencoders are versions of neural networks that can be applied to unsupervised learning tasks. Autoencoders are capable of taking unlabeled, free-form data and transforming it into data that a neural network can use, essentially creating their own labeled training data. The goal of an autoencoder is to encode the input data and rebuild it as accurately as possible, so the network has an incentive to determine which features are the most important and extract them.
Copyright unite.ai

What Is Deep Learning?


Deep learning is one of the most influential and fastest growing fields in artificial intelligence. However, getting an intuitive understanding of deep learning can be difficult because the term deep learning covers a variety of different algorithms and techniques. Deep learning is also a subdiscipline of machine learning in general, so it’s important to understand what machine learning is in order to understand deep learning.

Machine Learning

Deep learning is an extension of some of the concepts originating from machine learning, so for that reason, let’s take a minute to explain what machine learning is.
Put simply, machine learning is a method of enabling computers to carry out specific tasks without explicitly coding every line of the algorithms used to accomplish those tasks. There are many different machine learning algorithms, but one of the most commonly used is the multilayer perceptron. A multilayer perceptron is also referred to as a neural network, and it is composed of a series of nodes/neurons linked together. There are three different layers in a multilayer perceptron: the input layer, the hidden layer, and the output layer.
The input layer takes the data into the network, where it is manipulated by the nodes in the middle/hidden layer. The nodes in the hidden layer are mathematical functions that can manipulate the data coming from the input layer, extracting relevant patterns from the input data. This is how the neural network “learns”. Neural networks get their name from the fact that they are inspired by the structure and function of the human brain.
The connections between nodes in the network have values called weights. These values are essentially assumptions about how the data in one layer is related to the data in the next layer. As the network trains the weights are adjusted, and the goal is that the weights/assumptions about the data will eventually converge on values that accurately represent the meaningful patterns within the data.
Activation functions are present in the nodes of the network, and they transform the data in a non-linear fashion, enabling the network to learn complex representations of the data. Each node first multiplies its input values by the weight values and adds a bias term; the activation function is then applied to that weighted sum.
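Putting the last few paragraphs together: a node computes a weighted sum of its inputs plus a bias, then passes the result through an activation function. A minimal forward pass with illustrative, hand-picked (not trained) weights:

```python
import math

# One node: weighted sum of inputs plus bias, then a non-linear activation.
def neuron(inputs, weights, bias, activation):
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(z)

relu = lambda z: max(0.0, z)                    # hidden-layer activation
sigmoid = lambda z: 1.0 / (1.0 + math.exp(-z))  # output activation

# Tiny 2-input -> 2-hidden (ReLU) -> 1-output (sigmoid) forward pass.
# All weights are hand-picked for illustration, not learned.
x = [0.5, -1.0]
hidden = [neuron(x, [0.4, 0.1], 0.0, relu),
          neuron(x, [-0.3, 0.8], 0.2, relu)]
out = neuron(hidden, [1.0, -1.0], 0.0, sigmoid)
print(round(out, 3))
```

Training adjusts the weight and bias values so that outputs like `out` converge toward the correct answers for the training data.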

Defining Deep Learning

Deep learning is the term given to machine learning architectures that join many multilayer perceptrons together, so that there isn’t just one hidden layer but many hidden layers. The “deeper” that the deep neural network is, the more sophisticated patterns the network can learn.
The deep layer networks comprised of neurons are sometimes referred to as fully connected networks or fully connected layers, referencing the fact that a given neuron maintains a connection to all the neurons surrounding it. Fully connected networks can be combined with other machine learning functions to create different deep learning architectures.

Different Deep Learning Architectures

There are a variety of deep learning architectures used by researchers and engineers, and each of the different architectures has its own specialty use case.
Convolutional Neural Networks
Convolutional neural networks, or CNNs, are the neural network architecture commonly used in the creation of computer vision systems. The structure of convolutional neural networks enables them to interpret image data, converting them into numbers that a fully connected network can interpret. A CNN has four major components:
  • Convolutional layers
  • Subsampling/pooling layers
  • Activation functions
  • Fully connected layers
The convolutional layers are what takes in the images as inputs into the network, analyzing the images and getting the values of the pixels. Subsampling or pooling is where the image values are converted/reduced to simplify the representation of the images and reduce the sensitivity of the image filters to noise. The activation functions control how the data flows from one layer to the next layer, and the fully connected layers are what analyze the values that represent the image and learn the patterns held in those values.
RNNs/LSTMs
Recurrent neural networks, or RNNs, are popular for tasks where the order of the data matters and the network must learn about a sequence of data. RNNs are commonly applied to problems like natural language processing, as the order of words matters when decoding the meaning of a sentence. The “recurrent” part of the name comes from the fact that the output for a given element in a sequence is dependent on the previous computation as well as the current computation. Unlike other forms of deep neural networks, RNNs have “memories”: information calculated at earlier time steps in the sequence is used to calculate later values.
There are multiple types of RNNs, including bidirectional RNNs, which take future items in the sequence into account, in addition to the previous items, when calculating an item’s value. Another type of RNN is the Long Short-Term Memory, or LSTM, network. LSTMs are RNNs that can handle long chains of data. Regular RNNs may fall victim to the vanishing and exploding gradient problems, which occur when the chain of input data becomes extremely long, but LSTMs use gating mechanisms to combat these problems.
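The “memory” described above comes from feeding each step’s hidden state into the next step. A minimal scalar sketch; the weights are illustrative hand-picked values, and a real RNN would use vectors and learned parameters:

```python
import math

# A minimal recurrent step: the hidden state at time t depends on the
# current input AND the previous hidden state -- the network's "memory".

w_x, w_h, b = 0.5, 0.8, 0.0   # input weight, recurrent weight, bias

def rnn_step(x, h_prev):
    return math.tanh(w_x * x + w_h * h_prev + b)

h = 0.0                        # initial hidden state
for x in [1.0, 0.5, -1.0]:     # a short input sequence, processed in order
    h = rnn_step(x, h)
print(round(h, 4))             # final state summarizes the whole sequence
```

Because `h` is folded back in at every step, changing an early input changes every later hidden state, which is exactly how order information is preserved.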
Autoencoders
Most of the deep learning architectures mentioned so far are applied to supervised learning problems, rather than unsupervised learning tasks. Autoencoders are able to transform unsupervised data into a supervised format, allowing neural networks to be used on the problem.
Autoencoders are frequently used to detect anomalies in datasets, an example of unsupervised learning as the nature of the anomaly isn’t known. Such examples of anomaly detection include fraud detection for financial institutions. In this context, the purpose of an autoencoder is to determine a baseline of regular patterns in the data and identify anomalies or outliers.
The structure of an autoencoder is often symmetrical, with hidden layers arrayed such that the output of the network resembles the input. The four types of autoencoders that see frequent use are:
  • Regular/plain autoencoders
  • Multilayer encoders
  • Convolutional encoders
  • Regularized encoders
Regular/plain autoencoders are just neural nets with a single hidden layer, while multilayer autoencoders are deep networks with more than one hidden layer. Convolutional autoencoders use convolutional layers instead of, or in addition to, fully-connected layers. Regularized autoencoders use a specific kind of loss function that lets the neural network carry out more complex functions, functions other than just copying inputs to outputs.
Generative Adversarial Networks
Generative Adversarial Networks (GANs) are actually multiple deep neural networks instead of just one network. Two deep learning models are trained at the same time, and their outputs are fed to the other network. The networks are in competition with each other, and since they get access to each other’s output data, they both learn from this data and improve. The two networks are essentially playing a game of counterfeit and detection, where the generative model tries to create new instances that will fool the detective model/the discriminator. GANs have become popular in the field of computer vision.

Summing Up

Deep learning extends the principles of neural networks to create sophisticated models that can learn complex patterns and generalize those patterns to future datasets. Convolutional neural networks are used to interpret images, while RNNs/LSTMs are used to interpret sequential data. Autoencoders can transform unsupervised learning tasks into supervised learning tasks. Finally, GANs are multiple networks pitted against each other that are especially useful for computer vision tasks.
Copyright unite.ai


What Is A Decision Tree?


A decision tree is a useful machine learning algorithm used for both regression and classification tasks. The name “decision tree” comes from the fact that the algorithm keeps dividing the dataset into smaller and smaller portions until the data has been divided into single instances, which are then classified. If you were to visualize the results of the algorithm, the way the categories divide would resemble a tree with many leaves.
That’s a quick definition of a decision tree, but let’s take a deep dive into how decision trees work. A better understanding of how decision trees operate, as well as of their use cases, will help you know when to use them in your machine learning projects.
General Format of a Decision Tree
A decision tree is a lot like a flowchart. To use a flowchart, you start at its starting point, or root, and then, based on how you answer the filtering criteria of that starting node, you move to one of the next possible nodes. This process repeats until an endpoint is reached.
Decision trees operate in essentially the same manner, with every internal node in the tree being some sort of test/filtering criteria. The nodes on the outside, the endpoints of the tree, are the labels for the datapoint in question and they are dubbed “leaves”. The branches that lead from the internal nodes to the next node are features or conjunctions of features. The rules used to classify the datapoints are the paths that run from the root to the leaves.
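The internal-node/leaf structure described here can be sketched as a small nested-dictionary tree, with a classifier that walks from the root down to a leaf (the animal features and labels below are made up for illustration):

```python
# A hand-built decision tree as nested dicts: internal nodes hold a test
# (feature and threshold), leaves hold a class label.
tree = {
    "feature": "barks", "threshold": 0.5,
    "left":  {"feature": "claws", "threshold": 0.5,   # doesn't bark...
              "left":  {"leaf": "snake"},             # ...and has no claws
              "right": {"leaf": "cat"}},              # ...but has claws
    "right": {"leaf": "dog"},                         # barks
}

def classify(node, sample):
    """Follow the path from the root down to a leaf, as in a flowchart."""
    while "leaf" not in node:
        branch = "left" if sample[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return node["leaf"]

print(classify(tree, {"barks": 1, "claws": 0}))   # → dog
print(classify(tree, {"barks": 0, "claws": 1}))   # → cat
```

The rule that classifies each sample is exactly the root-to-leaf path it follows.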
Steps and Algorithms
Decision trees operate on an algorithmic approach that splits the dataset up into individual data points based on different criteria. These splits are made on different variables, or features, of the dataset. For example, if the goal is to determine whether a dog or a cat is being described by the input features, the variables the data is split on might be things like “claws” and “barks”.
So what algorithms are used to actually split the data into branches and leaves? There are various methods of splitting a tree, but the most common is probably a technique referred to as “recursive binary splitting”. When carrying out this method, the process starts at the root, and each feature in the dataset defines a possible split. A cost function is used to determine how much accuracy each possible split would sacrifice, and the split that sacrifices the least accuracy is made. This process is carried out recursively, and sub-groups are formed using the same general strategy.
In order to determine the cost of a split, a cost function is used. A different cost function is used for regression tasks than for classification tasks. The goal of both cost functions is to determine which branches have the most similar response values, or the most homogeneous branches. This makes intuitive sense: you want test data of a certain class to follow the same paths through the tree.
For regression with recursive binary splitting, the cost of a candidate split is calculated as follows:
sum(y – prediction)^2
The prediction for a particular group of data points is the mean of the responses of the training data for that group. All the data points are run through the cost function to determine the cost for all the possible splits and the split with the lowest cost is selected.
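A minimal sketch of this cost calculation, assuming a single feature and trying each observed value as a candidate threshold (the toy data is invented for illustration):

```python
import numpy as np

def split_cost(y_left, y_right):
    """sum(y - prediction)^2, where each group's prediction is its mean."""
    cost = 0.0
    for y in (y_left, y_right):
        if len(y):
            cost += np.sum((y - np.mean(y)) ** 2)
    return cost

def best_split(x, y):
    """Try every observed value as a threshold; keep the cheapest split."""
    best_t, best_cost = None, np.inf
    for t in np.unique(x):
        cost = split_cost(y[x <= t], y[x > t])
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t, best_cost

x = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0])
y = np.array([5.0, 6.0, 7.0, 20.0, 21.0, 22.0])
print(best_split(x, y))   # the cheapest split separates x <= 3 from x > 3
```

Here the two natural clusters in the data make the split at 3.0 the cheapest, since each side's responses sit close to their group mean.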
Regarding the cost function for classification, the function is as follows:
G = sum(pk * (1 – pk))
This is the Gini score, and it is a measurement of the effectiveness of a split, based on how many instances of different classes are in the groups resulting from the split. In other words, it quantifies how mixed the groups are after the split. An optimal split is when all the groups resulting from the split consist only of inputs from one class. If an optimal split has been created the “pk” value will be either 0 or 1 and G will be equal to zero. You might be able to guess that the worst-case split is one where there is a 50-50 representation of the classes in the split, in the case of binary classification. In this case, the “pk” value would be 0.5 and G would also be 0.5.
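A sketch of the Gini score for a candidate split; following a common convention, each group's score is weighted by its share of the samples, which reproduces the 0 and 0.5 endpoints described above (the labels are invented for illustration):

```python
def gini_split(groups):
    """G = sum(pk * (1 - pk)) per group, weighted by group size."""
    total = sum(len(g) for g in groups)
    score = 0.0
    for g in groups:
        if not g:
            continue
        impurity = sum((g.count(c) / len(g)) * (1 - g.count(c) / len(g))
                       for c in set(g))
        score += impurity * len(g) / total
    return score

print(gini_split([["cat", "cat"], ["dog", "dog"]]))   # optimal split → 0.0
print(gini_split([["cat", "dog"], ["dog", "cat"]]))   # 50-50 worst case → 0.5
```

A pure group contributes nothing to the score, so the optimal split scores exactly zero.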
The splitting process is terminated when all the data points have been turned into leaves and classified. However, you may want to stop the growth of the tree early. Large complex trees are prone to overfitting, but several different methods can be used to combat this. One method of reducing overfitting is to specify a minimum number of data points that will be used to create a leaf. Another method of controlling for overfitting is restricting the tree to a certain maximum depth, which controls how long a path can stretch from the root to a leaf.
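Both stopping rules can be sketched in a small recursive tree builder; the single-feature data, Gini-based split chooser, and parameter values below are all invented for illustration:

```python
from collections import Counter

def gini(labels):
    """Impurity of one group: equivalent to sum(pk * (1 - pk))."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_threshold(xs, ys):
    """Cheapest binary split of a single feature, by size-weighted Gini."""
    best_t, best_cost = None, float("inf")
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not left or not right:
            continue
        cost = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t

def build(xs, ys, depth=0, max_depth=3, min_samples=2):
    """Grow a node, stopping early at max_depth or below min_samples."""
    if depth >= max_depth or len(ys) < min_samples or len(set(ys)) == 1:
        return {"leaf": Counter(ys).most_common(1)[0][0]}  # majority class
    t = best_threshold(xs, ys)
    if t is None:
        return {"leaf": Counter(ys).most_common(1)[0][0]}
    left = [(x, y) for x, y in zip(xs, ys) if x <= t]
    right = [(x, y) for x, y in zip(xs, ys) if x > t]
    return {"threshold": t,
            "left": build(*zip(*left), depth + 1, max_depth, min_samples),
            "right": build(*zip(*right), depth + 1, max_depth, min_samples)}

xs = [1, 2, 3, 8, 9, 10]
ys = ["cat", "cat", "cat", "dog", "dog", "dog"]
print(build(xs, ys, max_depth=2))
```

Raising `min_samples` or lowering `max_depth` forces the builder to stop earlier and return majority-class leaves, which is exactly how these parameters combat overfitting.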
Another process involved in the creation of decision trees is pruning. Pruning can help increase the performance of a decision tree by stripping out branches containing features that have little predictive power/little importance for the model. In this way, the complexity of the tree is reduced, it becomes less likely to overfit, and the predictive utility of the model is increased.
When conducting pruning, the process can start at either the top or the bottom of the tree. The easiest method, however, is to start near the leaves: a node is replaced with a leaf labelled with the most common class within it, and if the accuracy of the model doesn’t deteriorate, the change is kept. There are other techniques for carrying out pruning, but the method described above – reduced error pruning – is probably the most common method of decision tree pruning.
Considerations For Using Decision Trees
Decision trees are often useful when classification needs to be carried out but computation time is a major constraint. Decision trees can make it clear which features in the chosen datasets wield the most predictive power. Furthermore, unlike many machine learning algorithms where the rules used to classify the data may be hard to interpret, decision trees can render interpretable rules. Decision trees are also able to make use of both categorical and continuous variables which means that less preprocessing is needed, compared to algorithms that can only handle one of these variable types.
Decision trees tend not to perform very well when used to determine the values of continuous attributes. Another limitation of decision trees is that, when doing classification, if there are few training examples but many classes the decision tree tends to be inaccurate.

Racial bias in a medical algorithm favors white patients over sicker black patients

A widely used algorithm that predicts which patients will benefit from extra medical care dramatically underestimates the health needs of...