Friday, November 1, 2019

Racial bias in a medical algorithm favors white patients over sicker black patients



A widely used algorithm that predicts which patients will benefit from extra medical care dramatically underestimates the health needs of the sickest black patients, amplifying long-standing racial disparities in medicine, researchers have found.
The problem was caught in an algorithm sold by a leading health services company, called Optum, to guide health care decision-making for millions of people. But the same issue almost certainly exists in other tools used by other private companies, nonprofit health systems and government agencies to manage the health care of about 200 million people in the United States each year, the scientists reported in the journal Science.
Correcting the bias would more than double the number of black patients flagged as at risk of complicated medical needs within the health system the researchers studied, and they are already working with Optum on a fix. When the company replicated the analysis on a national data set of 3.7 million patients, it found that black patients whom the algorithm ranked as equally in need of extra care as white patients were much sicker: They collectively suffered from 48,772 additional chronic diseases.
“It’s truly inconceivable to me that anyone else’s algorithm doesn’t suffer from this,” said Sendhil Mullainathan, a professor of computation and behavioral science at the University of Chicago Booth School of Business, who oversaw the work. “I’m hopeful that this causes the entire industry to say, ‘Oh, my, we’ve got to fix this.’”
The algorithm wasn’t intentionally racist — in fact, it specifically excluded race. Instead, to identify patients who would benefit from more medical support, the algorithm used a seemingly race-blind measure: how much patients would cost the health care system in the future. But cost isn’t a race-neutral measure of health care need. Black patients incurred about $1,800 less in medical costs per year than white patients with the same number of chronic conditions; thus the algorithm scored white patients as equally at risk of future health problems as black patients who had many more diseases.
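To make that mechanism concrete, here is a toy sketch, not the Optum model: the $600-per-condition cost and the $1,800 race-correlated gap are hypothetical numbers chosen only to mirror the pattern the researchers describe, where a cost-based risk score treats a sicker black patient and a healthier white patient as equally needy.

```python
# Purely illustrative sketch of using "future cost" as a proxy for medical need.
# All numbers are hypothetical; they only mirror the article's finding that black
# patients incur roughly $1,800 less per year at the same level of illness.

def expected_annual_cost(race, chronic_conditions):
    """Hypothetical cost model: ~$600 of spending per chronic condition,
    minus an ~$1,800 race-correlated gap in access to and use of care."""
    base = 600 * chronic_conditions
    return base - 1800 if race == "black" else base

patients = [
    {"id": "A", "race": "white", "conditions": 5},
    {"id": "B", "race": "black", "conditions": 8},
]

for p in patients:
    p["risk_score"] = expected_annual_cost(p["race"], p["conditions"])

# Both patients get the same cost-based "risk" score (3000), so a cost cutoff
# treats them as equally needy even though patient B is considerably sicker.
for p in sorted(patients, key=lambda p: -p["risk_score"]):
    print(p["id"], p["race"], p["conditions"], p["risk_score"])
```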
Machines increasingly make decisions that affect human life, and big organizations — particularly in health care — are trying to leverage massive data sets to improve how they operate. They utilize data that may not appear to be racist or biased, but may have been heavily influenced by longstanding social, cultural and institutional biases — such as health care costs. As computer systems determine which job candidates should be interviewed, who should receive a loan or how to triage sick people, the proprietary algorithms that power them run the risk of automating racism or other human biases. […]

AI in DC

NVIDIA Brings AI To DC

Nearly every enterprise is experimenting with AI and machine learning.

It seems like every week there’s a new survey out detailing the ever-increasing amount of focus that IT shops of all sizes put on the technology. If it’s true that data is the new currency, then it’s AI that mines that data for value. Your C-suite understands that, and it’s why they continually push to build AI and machine learning capabilities.
Nowhere is AI/ML more impactful than in the world of government and government contractors. It’s not just the usual suspects of defense and intelligence who demand these capabilities; AI/ML is fast becoming a fact of life across the spectrum of government agencies. If you’re a government contractor, then you’re already seeing AI/ML in an increasing number of RFPs and RFQs.

AI impacts everything

I’m a storage analyst. I don’t like to think about AI. I like to think about data. I advise my clients on how storage systems and data architecture must evolve to meet the needs of emerging and disruptive technologies. These days, those technologies all seem to be some variation of containerized deployments, hybrid-cloud infrastructure and enterprise AI. There’s no question that artificial intelligence is the most disruptive.
High-power GPUs dominate AI. Depending on the problem you’re trying to solve, that may be one GPU in a data scientist’s workstation, or it may be a cluster of hundreds of GPUs. It’s also a certainty that your deployment will scale over time in ways that you can’t predict today.
That uncertainty forces you to architect your data center to support the unknown. That could mean deploying storage systems that have scalable multi-dimensional performance that can keep the GPUs fed, or simply ensuring that your data lakes are designed to reduce redundancies and serve the needs of all that data’s consumers.
These aren’t problems of implementing AI, but rather of designing an infrastructure that can support it. Most of us aren’t AI experts. We manage storage, servers, software or networking. These are all things that will be disrupted by AI in the data center.
The single best way to prepare for the impacts of AI in the data center is to become educated on what it is and how it’s used. The dominant force in AI, and in GPU technology for machine learning, is NVIDIA. Thankfully, NVIDIA has a conference to help us all out.

NVIDIA’s GPU technology conference for AI

Every spring NVIDIA hosts its massive GPU Technology Conference (GTC) near its headquarters in Silicon Valley. It’s there where 6,000+ attendees gather to hear about all aspects of what NVIDIA’s GPUs can do. This ranges from graphics for gaming and visualization, to inference at the edge, to AI in the enterprise. It’s one of my favorite events each year (read my recap of the most recent GTC here, if interested). […]

What Kind of Problems Can Machine Learning Solve?

This article is the first in a series we’re calling “Opening the Black Box: How to Assess Machine Learning Models.”

The use of machine learning technology is spreading across all areas of modern organizations, and its predictive capabilities suit the finance function’s forward-looking needs. Understanding how to work with machine learning models is crucial for making informed investment decisions.
Yet, for many finance professionals, successfully employing them is the equivalent of navigating the Bermuda Triangle.
Properly deploying machine learning within an organization involves considering and answering three core questions:
  1. Does this project match the characteristics of a typical machine learning problem?
  2. Is there a solid foundation of data and experienced analysts?
  3. Is there a tangible payoff?

Does This Project Match the Characteristics of a Typical Machine Learning Problem?

Machine learning is a subset of artificial intelligence that’s focused on training computers to use algorithms for making predictions or classifications based on observed data.
Finance functions typically use “supervised” machine learning, where an analyst provides data that includes the outcomes and asks the machine to make a prediction or classification based on similar data.
With “unsupervised” machine learning, data is provided without outcomes and the machine attempts to glean them. However, given the popularity of supervised models within finance functions, our articles will focus on such models.
To present a very simple example: if you were attempting to train a model that predicts A + B = C using supervised machine learning, you would give it a set of observations of A and B along with the outcome C.
You would then tell an algorithm to predict or classify C, given A and B. With enough observations, the algorithm will eventually become very good at predicting C. Of course, a problem this simple is one humans can solve well on their own.
But what if the question were A + B + … + F(X) = Z?
Traditionally, humans would tackle that problem by simplifying the equation — by removing factors and introducing their own subjectivity. As a result, potentially important factors and data are not considered. A machine can consider all the factors and train various algorithms to predict Z and test its results.
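As a minimal sketch of the simple case above (assuming scikit-learn’s LinearRegression stands in for whichever algorithm an analyst might actually choose), training on observed A, B, and C values looks roughly like this:

```python
# Minimal supervised-learning sketch: train on observed (A, B) pairs with known
# outcomes C, then ask the model to predict C for unseen inputs.

import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Training data: observations of A and B, with the "right answer" C = A + B.
X_train = rng.uniform(0, 100, size=(1000, 2))   # columns are A and B
y_train = X_train.sum(axis=1)                   # observed outcome C

model = LinearRegression().fit(X_train, y_train)

# With enough observations the model recovers the relationship almost exactly.
print(model.predict([[3.0, 4.0]]))    # ~7.0
print(model.coef_, model.intercept_)  # coefficients ~[1, 1], intercept ~0
```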
In short, machine learning problems typically involve predicting previously observed outcomes using past data. The technology is best suited to solve problems that require unbiased analysis of numerous quantified factors in order to generate an outcome.

Is There a Solid Foundation of Data?

Machine learning models require data. As noted earlier, the data must also include observable outcomes, or “the right answer,” for the machine learning model to predict or classify.
For instance, if you are trying to predict what credit rating a private company might attain based on its financial statements, you need data that contains other companies’ financial statements and credit ratings. The machine learning model will look at all the financial statement data and the observable outcomes (in this case the other companies’ credit ratings), and then predict what the private company’s credit rating might be. […]
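A rough sketch of that kind of supervised classification might look like the following; the feature names and numbers are hypothetical stand-ins for real financial-statement data and agency ratings, and the classifier choice (a random forest) is just one reasonable option:

```python
# Hypothetical credit-rating classification sketch. A real project would use
# other companies' actual financial statements and observed ratings as labels.

import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Labeled data: financial-statement features plus the observed outcome
# (the credit rating each comparable company actually received).
training = pd.DataFrame({
    "debt_to_equity":    [0.4, 1.8, 0.9, 2.5, 0.3, 1.2],
    "interest_coverage": [9.0, 1.5, 4.0, 0.8, 12.0, 3.0],
    "revenue_growth":    [0.08, -0.02, 0.05, -0.10, 0.12, 0.03],
    "credit_rating":     ["A", "B", "BBB", "CCC", "A", "BBB"],
})

features = ["debt_to_equity", "interest_coverage", "revenue_growth"]
model = RandomForestClassifier(random_state=0)
model.fit(training[features], training["credit_rating"])

# Predict the rating a private company might attain from its own statements.
private_company = pd.DataFrame(
    [{"debt_to_equity": 1.0, "interest_coverage": 3.5, "revenue_growth": 0.04}]
)
print(model.predict(private_company))  # e.g., ["BBB"]
```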

How Machine Learning Could Impact the Future of Renewable Energy


More and more cities are looking to go green. And renewable energy is, if current trends hold, the future of the energy industry.


But as renewable energy technologies like wind farms are implemented at larger scales than ever, local officials are running into their limitations. The energy production of wind farms is hard to predict, and this makes energy grid design difficult.
Experts hope that machine learning can be applied to renewable energy to solve this problem. If it works, this new tech may make energy officials more enthusiastic about implementing renewables.
One downside of renewables is how hard it can be to predict the energy they produce. Wind speeds can vary widely from hour to hour and from day to day. You can average out how much wind a certain place gets over the course of a long period of time. And you can also use that information to figure out how much energy a wind farm may produce per year. But it’s much harder to accurately predict the energy a wind farm will produce on a given day or at a certain time.
Inaccurate predictions mean it’s harder to know if construction costs will be worth it. With renewables, too much and too little are both big problems. Create too little power and you’ll need to have supplemental energy sources at the ready. Generate too much power and you’ll need to either store that energy or waste it. And battery technology is just too expensive right now to store renewable energy at any sort of useful scale.
Machine learning technology — computer programs that use data sets to “learn” how to see patterns in information like wind speed and energy output — may be the answer to wind farms’ prediction problem.
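As an illustration of that idea (not Google’s DeepMind system), a model can be trained on historical wind-speed readings and observed output, then asked to predict output for a forecast wind speed; the data and feature names below are invented for the sketch:

```python
# Hypothetical wind-farm output prediction: learn the pattern between wind speed
# (plus a crude time-of-day feature) and observed energy output, then predict.

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 2000

wind_speed = rng.uniform(0, 25, n)       # m/s, from historical readings
hour_of_day = rng.integers(0, 24, n)     # stand-in for diurnal patterns
# Fake "observed" output: roughly cubic in wind speed, plus noise.
output_mwh = 0.02 * wind_speed**3 + rng.normal(0, 5, n)

X = np.column_stack([wind_speed, hour_of_day])
X_train, X_test, y_train, y_test = train_test_split(X, output_mwh, random_state=0)

model = GradientBoostingRegressor(random_state=0).fit(X_train, y_train)
print("R^2 on held-out data:", round(model.score(X_test, y_test), 3))

# Predict output for a forecast wind speed of 12 m/s at 3 p.m.
print(model.predict([[12.0, 15]]))
```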
The same machine learning tech, experts think, could be used to make green energy more predictable. In February 2019, Google announced that it was using DeepMind, the company’s in-house machine learning technology, to predict the energy output of wind farms.
The machine learning technology has already made wind farm predictions 20 percent more valuable, according to Google. And better value means that wind farms may be seen as a safer investment by municipal officials who control which kinds of energy projects get built.
Will machine learning build better wind farms? It’s hard to say. But machine learning has been successful in related fields.
The weather is notoriously difficult to predict, for many of the same reasons that it’s hard to predict wind speeds. A good prediction needs to take into account more variables than a person can keep track of — like changing levels of humidity, pressure and temperature. Predicting the weather is so hard, in fact, that IBM acquired The Weather Company to see if machine learning could make weather predictions better. The results? According to IBM, it achieved a nearly 200 percent increase in the accuracy of forecasts. […]
