TABLE OF CONTENTS
Introduction
Guest Speaker
Transcript
Introduction
Social determinants of health (SDoH) are important variables to consider when interacting with Medicare Advantage populations. As the healthcare system increases its use of digital technologies such as telehealth visits, online prescriptions, and digital communications, digital inequality is becoming a forefront issue within SDoH.
The study of digital inequality examines populations to determine internet access, internet literacy, and how people in different sub-populations use the internet. All of these factors influence how plans should develop messaging and outreach programs.
To collect this information and develop solutions that reach this population, health plan data scientists have to be mindful of SDoH when building algorithms in order to avoid bias. For example, a newer strategy for closing care gaps is to offer telehealth services. To test the efficacy of this strategy, a health plan might build a model that predicts a member's likelihood of using telehealth services, using cohorts of urban and rural member data. If the plan doesn't include data related to digital inequality, it may reach the faulty conclusion that telehealth is a good solution for its rural members, and the telehealth program may suffer poor engagement as a result. By including SDoH data, health plans can identify micro-segments of the population and innovate solutions that address each cohort. In this case, the health plan might partner with an internet service provider and pay a portion of the internet fees so that its offline members can access telehealth visits.
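To make the telehealth example concrete, the following is a minimal Python sketch of how a data science team might compare a model's predictions against actual uptake in urban and rural cohorts, with and without a digital-access feature. The data is simulated and the column names (is_rural, has_broadband, used_telehealth) are illustrative assumptions, not fields from any real plan's data.

import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
members = pd.DataFrame({
    "age": rng.integers(65, 90, n),
    "chronic_conditions": rng.poisson(2, n),
    "is_rural": rng.random(n) < 0.4,
})
# Simulation assumption: rural members are less likely to have reliable
# broadband, and telehealth use depends heavily on broadband access.
members["has_broadband"] = rng.random(n) < np.where(members["is_rural"], 0.5, 0.9)
members["used_telehealth"] = rng.random(n) < np.where(members["has_broadband"], 0.45, 0.05)

def predicted_vs_actual(features):
    model = LogisticRegression(max_iter=1000).fit(
        members[features], members["used_telehealth"])
    members["score"] = model.predict_proba(members[features])[:, 1]
    for is_rural, grp in members.groupby("is_rural"):
        cohort = "rural" if is_rural else "urban"
        print(f"  {cohort}: predicted uptake {grp['score'].mean():.0%}, "
              f"actual uptake {grp['used_telehealth'].mean():.0%}")

print("Model without digital-inequality data:")
predicted_vs_actual(["age", "chronic_conditions"])
print("Model with a broadband-access feature:")
predicted_vs_actual(["age", "chronic_conditions", "has_broadband"])

On simulated data like this, the first model reports roughly the same expected uptake for both cohorts and overstates rural engagement, which is exactly the faulty conclusion described above; adding the access feature exposes the rural gap the plan needs to design around.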
The takeaway is that the data used to feed algorithms must be consistent with the population the logic is applied to. One way to build models accurately is to make data science teams well-rounded, including not just highly technical data scientists but also social scientists who can account for the specific needs and challenges of any given population.
Guest Speaker
Brandon Brooks
Data Scientist
Brandon Brooks is a Data Scientist working with Member Acquisition & Engagement technology, a machine learning solution for Medicare Advantage plans. With over ten years of experience in computational social science research, he is an expert in human communications and engagement in digital ecosystems using data and behavioral sciences. He has worked in several industries, including healthcare, energy, environmental services, education, and information technology. With an appreciation for details and analytics, Brandon is highly skilled at telling stories with data and enjoys working on complex problems without a clear solution.
Transcript
Host: Welcome to Episode 10, Ethics, Data Science, and SDoH. We’re joined today by data scientist Brandon Brooks. Welcome, Brandon. I’m thrilled to have you on the show today.
Brandon: Hey, yeah, it’s great to be here.
Host: Brandon, you have an incredible professional background. You’ve got your PhD in Media and Information Studies. You taught big data, social network analysis, and social science research methods as a university professor. Then you got into analytics consulting to help organizations make better decisions using data-informed policy. Now you’re with Advantasure, focused on developing analytics products for health plans and providers around the country, specifically models that improve targeted digital engagement and outreach efforts with members in the Medicare Advantage and Medigap product areas.
Brandon: Yeah, it sounds like a broad array of experiences, but there’s a common professional theme in the way I view my role as a data scientist: helping organizations make ethical data science decisions that improve people’s lives.
For example, one field that I have worked in, digital inequality, is all about looking at populations to determine who has access to the internet, who does or doesn’t have internet literacy, and how people of different sub-populations are using the internet, which of course influences how you develop messaging and outreach campaigns to those people. As a society, we have to figure out how we can help people get better access to the internet, how we improve people’s literacy, and ultimately how that impacts their quality of life, because different social factors affect how digital messages are perceived. These questions are important. Digital inequality and other social factors are relevant across every sector of life, but in the fast-paced, changing healthcare world they’re even more important as telehealth visits, online prescriptions, and other services go digital.
Host: That’s fascinating. We all know that social determinants of health are an important aspect of reaching Medicare Advantage populations—whether that’s getting them into a plan, closing care gaps, or engaging members. How do you use your knowledge of digital inequality to improve member acquisition and outreach campaigns for health plans?
Brandon: That’s a great question. First, I want to dispel this perception that data scientists are a magic bullet. You know, people think if you have enough data, you can solve any problem. It’s not the data that solves problems; it’s the line of questioning that digs down to the root of a problem. It’s human creativity that attempts to find solutions to old problems in new ways. The data is not the answer. The data supports, validates, and expedites the analytical process of finding the answer.
Of course we want a robust data set to build models and predict who’s going to do something or how to encourage a certain action. But if you’re not asking the right questions, you’re not going to get the right answers. And the other problem is the potential for ethical issues and bias in the data. AI is primarily based on historical data, so you’re always predicting people’s behavior from past trends, and that bakes the potential for ethical issues and bias into the algorithm you’re building.
There's evidence of this in healthcare algorithms. In one study, an algorithm was built to identify hospital patients who needed high-risk care management. The algorithm was built on faulty logic: it equated healthcare spending with healthcare need. That sounds like common sense, but the problem is that Americans in higher socioeconomic classes tend to spend more on healthcare than those in lower socioeconomic classes simply because they have more to spend, while Americans with less money and fewer resources spend less and often forgo necessary medical care such as doctor visits, preventive care, and filling prescriptions. The study uncovered that the algorithm was unintentionally and disproportionately placing healthier patients into the high-risk care management programs while leaving out sicker patients who spent less on healthcare but had a much greater need. Now, imagine what the effect would be if this algorithm ran for a decade on one community. How many lives would be upended without any known reason? This would cause lasting harm to the healthcare industry. It’s necessary, and worth the cost, to spend the time and resources to investigate and understand the nuances of algorithmic decision-making and its outcomes prior to implementation.
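To illustrate the proxy problem in that study, here is a short, purely hypothetical Python sketch: two income groups are simulated to be equally sick, but they spend different amounts per unit of need, and patients are then ranked by spending versus by need to decide who enters care management. The dollar figures and the five percent enrollment cutoff are invented to show the mechanism, not to reproduce the study's data.

import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 10_000
# "need" is the true number of active chronic conditions, identical in
# distribution for both income groups by construction.
need = rng.poisson(3, n)
low_income = rng.random(n) < 0.5
# Assumption: higher-income patients spend more per condition, while
# lower-income patients forgo care, so spending understates their need.
spending = need * np.where(low_income, 800, 2000) + rng.normal(0, 500, n)
patients = pd.DataFrame({"need": need, "spending": spending, "low_income": low_income})

# Enroll the top 5% into care management under each ranking rule.
k = int(0.05 * n)
by_spending = patients.nlargest(k, "spending")
by_need = patients.nlargest(k, "need")

print("Share of low-income patients among enrollees:")
print(f"  ranked by spending (the proxy): {by_spending['low_income'].mean():.0%}")
print(f"  ranked by actual need:          {by_need['low_income'].mean():.0%}")
print("Average chronic conditions among enrollees:")
print(f"  ranked by spending: {by_spending['need'].mean():.1f}")
print(f"  ranked by need:     {by_need['need'].mean():.1f}")

With these made-up numbers, the spending-ranked program fills almost entirely with higher-income patients, even though both groups were simulated to be equally sick; that is the shape of the bias the study uncovered.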
We can also see this in the hiring patterns for data scientists across industries. Hiring tends to focus on highly technical data scientists with a background in computer science. While those folks are highly trained and competent, they don’t necessarily receive the training needed to think about how something like health costs could be associated with someone’s socioeconomic status. And historical data may continue to further systemic issues already present in the healthcare system. Even the healthcare systems that can afford to hire data science teams likely come from more affluent backgrounds and serve more affluent patient populations, which then creates the data used to build those algorithms, which in turn biases the results and limits the algorithm’s usefulness for creating a fair and equitable healthcare experience for all members.
Another area of bias that the healthcare industry is becoming more aware of is rural versus urban areas. There are just more hospitals and specialized providers in urban settings; it’s harder to attract healthcare workers to sparsely populated areas. Let’s say that, to account for this, we wanted to create a model that predicted the likelihood of a member using a telehealth provider. That sounds like a great idea, right? We find out which people are most likely to be users of telehealth, and then try to apply those same factors across urban and rural environments. This is a service offered by some health plans, which work across different states, different geographies, and different demographic bases. A health plan building a model like that could spend a ton of time and money and come up with a model that looks highly accurate, but not have access to data on any of the issues related to digital inequality that are necessary to predict someone’s likelihood of actually using telehealth. So the model is useless because it's built on incomplete data, unless it’s deployed in a different way: maybe this model creates an opportunity for the health plan to partner with an internet service provider. The health plan pays a portion of the internet fees, and the member gets telehealth visits. That’s the other side of these modeling stories and, ultimately, the story of data science. It’s not one-size-fits-all, and there can be multiple ways to use one model, but we have to think and act creatively outside the boundaries of what we think is possible. This is also why we can’t just have computer scientists building models. We need programmers, social scientists, qualitative and quantitative people, user experience thinkers, and ethics- and policy-focused people. This group needs to think about the conceptual framework of the model, the outcomes, the bias baked in from the beginning, and the bias that shows up in the output.
Host: It makes sense that the data used to feed the algorithms has to be consistent with the population you’re applying the logic to. When we talk about using data for marketing, you know, for health plan enrollment and retention efforts—what are the implications? What’s the potential harm that can occur from marketing or messaging to populations with inaccurate models?
Brandon: The problem is that you could be sending the wrong message to the wrong people. That doesn’t sound like a big deal, but let me tell you why this is a costly mistake with ethical implications. First, from the plan’s perspective, this can create a negative brand image, harming potential members’ perception of the plan and hurting annual enrollment. Obviously, this isn’t good for business. But what I worry about most in these situations is the model being too broad, meaning the messaging speaks in a non-specific way, using general language so it applies to more people. That kind of campaign could encourage someone to sign up for the wrong plan for their needs. This could affect the care they receive and really impact their health, their finances, their overall quality of life, and ultimately, for the plan, retention. People have a ton of options in the Medicare Advantage product space, and pushing a plan that a member signs up for and ends up not liking will bring short-term value, but it won’t make a brand ambassador out of that person. Having a health plan aligned with a member’s needs is extremely important for good health outcomes. Of course, this also impacts a plan’s Star ratings and member experience, so there are business consequences, too. The take-home is that plans have an ethical responsibility to make sure their member models are correct and that the logic used in data models is accurate and congruent with their target population.
Host: I think you’ve exposed something really important. You know, marketing and communications is more than just sending messages. They have real consequences that affect real people. Are there any checks and balances for making sure the data sources, the models, and the end messages are aligned?
Brandon: Massive mistakes are possible, and that speaks to the inherent risk in big data science efforts. The first thing is to always deploy at a small scale, using randomized controlled experiments to test effectiveness and determine whether false positives or false negatives will occur. Second, conduct qualitative interviews and focus groups, and examine the data at a larger scale, to test a model’s outcomes beyond its statistical outputs. To preempt problems, every data science team needs someone who is focused on the ethics and bias of models from a behavioral science perspective, because those professionals understand how social factors impact or bias our decision-making. What’s happened in the data science world is that there’s an imbalanced demand for data scientists, statisticians, and really technical people. Where the industry is still lacking is in making it the norm to include team members who think about the social ramifications of what goes into a model and what happens when you deploy it. The other way to prevent data bias is in the contracting phase, by integrating a contingency for when the data doesn’t work. Let’s say there’s a health plan with 5 counties, and they want to expand to a new county. The communications vendor or data-science-as-a-service vendor should include a contingency that says: if the data turns out to apply to only a percentage of the new county’s population rather than the whole population, they’ll only deploy campaigns to the population they have good models for. Ethics should be king, not the contract value. Going back to your question, the other way to ensure good campaign outcomes for both the health plan and the member is to make sure you’ve developed a model that asks the right questions. You have to aim to solve the true problem and not just address the symptoms of a problem. It’s no different than giving someone a Tylenol for a headache; we haven’t asked why the person has a headache. There might be something deeper to uncover. Data science is the same way: we have to consider how we select the metrics and how those metrics influence behavior.
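The "deploy at a small scale and test" step can be as simple as the hypothetical pilot sketched below in Python: a random half of a small member sample is held out from the campaign so that lift can be measured and the model's false positives and false negatives can be inspected before anything scales. The member counts, response rates, and column names are all assumptions made for illustration.

import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
n = 2000
pilot = pd.DataFrame({
    "member_id": np.arange(n),
    "flagged_by_model": rng.random(n) < 0.3,  # model says "likely responder"
})
# Randomize: half the pilot receives the outreach, half is a holdout control.
pilot["treated"] = rng.random(n) < 0.5

# Simulated ground truth: outreach helps, and flagged members respond more.
base_rate = np.where(pilot["flagged_by_model"], 0.20, 0.08)
pilot["responded"] = rng.random(n) < np.where(pilot["treated"], base_rate + 0.10, base_rate)

lift = (pilot.loc[pilot["treated"], "responded"].mean()
        - pilot.loc[~pilot["treated"], "responded"].mean())
print(f"Observed lift from outreach vs. holdout: {lift:.1%}")

# Check the model itself against what actually happened in the treated group.
treated = pilot[pilot["treated"]]
flagged = treated[treated["flagged_by_model"]]
missed = treated[~treated["flagged_by_model"]]
print(f"Flagged members who did not respond (false positives): {(~flagged['responded']).mean():.0%}")
print(f"Unflagged members who did respond (false negatives):   {missed['responded'].mean():.0%}")

If the lift is marginal, or the error rates are lopsided for a particular cohort, that shows up here on two thousand members rather than after a full deployment.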
As an example, let’s say a health plan wants to close medication gaps. So we look at income, education, and other demographic information because we think it will have an effect on whether someone is likely to fill their prescription. Then we run a digital outreach campaign because we think it will have a big impact and close a bunch of gaps. That might work ever so slightly, but let’s step back and think about the outcome. The outcome is closing gaps, but what we haven’t done is ask why. Why do these gaps exist to begin with? There might be specific groups of people whose gaps can’t be closed by an email communication because they have other needs, like geographic proximity to a pharmacy. In that case, our campaigns are not going to be successful. By asking why and getting to the root cause of the problem, the health plan can provide solutions that work. In the example of the pharmacy desert, the open gaps are a symptom of the bigger problem: not having nearby pharmacies. Now the health plan knows it needs to figure out how to deploy additional pharmacy resources in the area, and maybe create partnerships with pharmacy providers. A digital campaign that aims to get members enrolled in an online or mail-order pharmacy program could be successful, because the metric is correct: it’s pharmacy availability, not gap closure alone. Getting the metric correct influences everyone’s behavior from the top down. And getting members enrolled in mail-order prescriptions or digital pharmacy orders would ultimately improve medication adherence, which is still the ultimate goal.
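A sketch of what getting the metric right might look like operationally, in Python: members with open medication gaps are first segmented by whether a pharmacy is realistically reachable, and only the reachable group gets a refill-reminder email while the rest are routed to mail-order enrollment outreach. The distance threshold, member records, and column names below are hypothetical.

import pandas as pd

members = pd.DataFrame({
    "member_id": [101, 102, 103, 104, 105],
    "open_med_gap": [True, True, True, True, False],
    "miles_to_nearest_pharmacy": [0.8, 2.5, 14.0, 22.0, 1.1],
})

# Hypothetical threshold: beyond 10 miles we treat the member as living in a
# pharmacy desert, where a reminder email will not address the real barrier.
DESERT_MILES = 10.0

gaps = members[members["open_med_gap"]]
reminder_email = gaps[gaps["miles_to_nearest_pharmacy"] <= DESERT_MILES]
mail_order_outreach = gaps[gaps["miles_to_nearest_pharmacy"] > DESERT_MILES]

print("Send refill reminder email to:", list(reminder_email["member_id"]))
print("Offer mail-order pharmacy enrollment to:", list(mail_order_outreach["member_id"]))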
Host: With SDoH, what are the data sources that help define the needs of populations? How do you arrive at the conclusion that there’s a pharmacy desert, a food desert, or a transportation insufficiency?
Brandon: It’s definitely a challenge to find the populations with specific needs. This is where more and better data are needed, and the industry has a lot of room for growth. Health plans are administering member surveys, so self-reporting is probably an accurate source when used in conjunction with zip-code-level data. But zip-code-level data is tricky because there can be wide ranges of income and education levels within a single zip code. If you were to use zip-code data for pharmacies, the data could suggest that there’s an adequate number of pharmacies for the population, but in reality there could be pockets of neighborhoods with no pharmacy access. Some data scientists are drilling down to street-level data and analyzing enrollment rates of mail-order pharmacy users, which could be a potential solution.
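Here is a toy Python illustration of that aggregation problem: at the zip-code level the pharmacy-to-resident ratio can look adequate even while individual neighborhoods inside the zip have no pharmacy at all. The zip label, neighborhood names, populations, and pharmacy counts are invented for the example.

import pandas as pd

# Hypothetical neighborhood-level counts inside a single zip code.
neighborhoods = pd.DataFrame({
    "zip": ["Z1", "Z1", "Z1", "Z1"],
    "neighborhood": ["A", "B", "C", "D"],
    "population": [12_000, 9_000, 15_000, 8_000],
    "pharmacies": [6, 5, 0, 0],
})

# Zip-level view: roughly one pharmacy per 4,000 residents looks adequate.
zip_view = neighborhoods.groupby("zip")[["population", "pharmacies"]].sum()
zip_view["residents_per_pharmacy"] = zip_view["population"] / zip_view["pharmacies"]
print(zip_view)

# Neighborhood-level view: 23,000 residents live in pockets with no pharmacy.
deserts = neighborhoods[neighborhoods["pharmacies"] == 0]
print("Neighborhoods with no pharmacy access:", list(deserts["neighborhood"]),
      "covering", deserts["population"].sum(), "residents")

The same pattern applies to income, broadband, or transportation data: the finer the geography, and the more self-reported survey data layered on top, the less likely a pocket of need is to be averaged away.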
And sometimes the data just isn’t available. In these situations, the best path to market may not be digital. This is where boots-on-the-ground efforts hold a lot of value. Sometimes plans have to get into those communities and engage on a personal level, whether that’s through agents or mixed methods. For example, take the Medicare Advantage marketing space. There are so many plan options available to people, and the number of 4- and 5-Star plans has doubled this year. How does a retiree make a decision about which Medicare Advantage plan to choose? What communication channels are best for engaging with that person and marketing plans to them? Big data is just one piece of that puzzle. There could be barriers that a person experienced during their career that mean they will never choose a certain Medicare Advantage plan. Understanding that, and the other barriers or opportunities to market to potential enrollees, requires both a quantitative data approach and a qualitative approach. Perhaps we use agent data to understand which plans agents sell the most. Then plans can use interviews and focus groups with those agents to understand what’s working, why, and how to apply it. Then develop a product based on the quantitative and qualitative data, deploy it at a small scale, and see what happens. If it’s showing success, scale it up.
The take-home message is that whether you’re building a healthcare algorithm or a marketing algorithm, the data needs to be looked at from all angles, quantitative and qualitative, and you have to question your assumptions to get to the root cause of the problem you’re attempting to solve. Test it, and keep challenging the foundation of the algorithm.
Host: Brandon, you’re right. Algorithms aren’t magic; they’re built by humans, so there will be times when the results aren’t what we expect. So, I appreciate that mindset of continually reviewing and challenging the technology.
Brandon: Exactly. That’s the ethical side of data science, and we, the industry, definitely need to focus more on it to be good stewards of data, technology, and the people it impacts.
Host: This has been really insightful, and it’s a powerful message that the healthcare industry needs to hear. Brandon, thank you so much for joining today.
Brandon: Thanks, I’ve enjoyed it.
Host: To all our listeners, thank you for listening in. Don’t forget to follow the podcast on Apple or Spotify, rate it, and share it with your colleagues. See you next time.