Advanced Persistent Talent

ThreatConnect’s Bayesian Expert: It’s Time to Look Beyond Cool Data Models

Posted April 18, 2025

The Advanced Persistent Talent series profiles ThreatConnect employees and explores how their work impacts products and offerings, how they got here, and their views on the industry at large. Want to know more about a particular team? Let us know!

Bayesian analysis is a statistical method where you start with a “prior belief” about something, then update that belief as new data becomes available. It’s growing in popularity, especially in machine learning circles, because it lets data scientists model uncertainty explicitly and combine prior knowledge with new evidence.
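To make the idea of updating a prior concrete, here is a minimal sketch in Python. It is an illustration only, not a description of how ThreatConnect or CAL works. It assumes a Beta prior over an unknown rate and a hypothetical scenario: we initially believe about 20% of indicators from some feed are malicious, then observe 6 malicious indicators in a new sample of 10. The function names and all the numbers are made up for the example.

```python
from fractions import Fraction


def update_beta(alpha, beta, hits, misses):
    """Conjugate Beta-Binomial update: add the observed counts
    to the prior's pseudo-counts to get the posterior."""
    return alpha + hits, beta + misses


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return Fraction(alpha, alpha + beta)


# Hypothetical prior belief: Beta(2, 8), i.e. a prior mean of 0.2.
prior = (2, 8)

# Hypothetical new evidence: 6 malicious and 4 benign indicators.
posterior = update_beta(*prior, hits=6, misses=4)

print(float(beta_mean(*prior)))      # 0.2
print(float(beta_mean(*posterior)))  # 0.4
```

The belief moves from 0.2 toward the observed rate of 0.6 but does not jump all the way there, because the prior still carries weight. With more data, the evidence would increasingly dominate the prior.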

ThreatConnect has its own Bayesian expert: Senior Data Science Manager John Snyder. Before joining ThreatConnect, Snyder spent seven years studying Bayesian analysis for his PhD.

Today, Snyder works on CAL™, ThreatConnect’s data analysis platform that uses generative AI, natural language processing, and machine learning (ML) to surface critical threat intel insights for customers.

If you’ve ever wondered about some of the human minds working behind the scenes of ThreatConnect’s algorithms, read on to learn more about how Snyder found himself in cybersecurity, what he works on day to day, and what he thinks about our AI-influenced future.

The following conversation has been edited for clarity and length.

ThreatConnect: After earning your PhD, you worked at Monsanto as a data scientist. How did you end up in cybersecurity?

John Snyder: For a PhD in statistics, generally you’re creating some methodology. Then, you have to study your method—study it under simulation. And those simulations are a computer programming exercise. And I mean, it’s millions and millions of little calculations that have to be done. And it’s too much for my little laptop, right?

So I bought a server rack, I bought servers, I bought networking equipment, and I basically built my own mini data center to do my simulations on.

I started to get really interested in network stuff, in cybersecurity around that time—figuring out how to make everything work together. And so from that point, it became a budding interest that kept developing in the background as I started my career.

What are some of the biggest differences between working in agriculture and the cybersecurity world?

It’s much more practical because everything in cyber is observational. We’re observing a never-ending fire hose of information. Whereas in agriculture, it’s much more controlled and optimized for the specific thing that they’re looking for.

You’re a data science manager at ThreatConnect. What does that entail?

I work on the CAL team. CAL stands for the Collective Analytics Layer, and it’s essentially ThreatConnect’s big data platform. My job as a data scientist working in this wing of the organization is to take that mountain of data that we have and turn it into insights for the customers who use this service: insights that they wouldn’t be able to get if they were only looking at their own isolated window of data.

Want to see CAL in action? Take a free tour here.

What are some of the most interesting challenges you’ve worked on to date?

Often, a data scientist will tend to isolate. They have data and they’re like, “All right, I’m going to build the best model for this data.”

What I personally find incredibly interesting is knowing that ThreatConnect does a ton of stuff and I am a piece of that. I’m trying to enhance or provide new insights with the data, but I need to make sure that my solutions are good and fit within the existing framework of the technology that we use.

It allows me to flex my statistical consulting brain because I’ll speak with stakeholders about their dreams for data and what insights they want to extract. I’ll also work with engineers to say, “Is the solution I’m building something we can operationalize and make work on the billions of indicators that we have in CAL?”

The balancing act there is super interesting. I can’t just build the biggest deep learning model, because it has to run multiple times a day, and it’s not going to work right in the framework. So sometimes I have to pare things down a bit. Other times, when I don’t have those restrictions, I can scale it up and provide a very big solution.

Is there anything at ThreatConnect that you wish more people knew about?

A lot of people, when they develop solutions in my kind of role, want their predictions to stand out. You want them to look cool.

But our philosophy, and the philosophy that was instilled in me when I joined and started working on problems, was we only want to alert the customer to something going wrong if we’re damn sure that something is wrong.

So we take a very conservative approach to the information that we present to customers. It’s not flashy, so we don’t advertise it or shout it from the mountaintops because it’s not, quote, unquote, sexy to say that kind of thing.

What advice would you give to companies that want to better leverage data science in cybersecurity?

You should look at the models that you have at your disposal, whether it’s an AI model—like a large language model or an XGBoost model—or a linear regression. You should look at these as tools that you’re using to build something, but the focus should be on what you are trying to build. The focus should be on the problems that you’re trying to solve.

Data scientists and organizations, they’ll work on a problem and it will never go into production. And that’s the majority of projects, right?

I do not have that problem here at ThreatConnect because I spend a lot of time speaking to project managers, making sure that the solution I’m developing is going to answer questions and will be able to be productized.

People need to be very sure, very careful, that they’re hiring people and have teams in place that are focused on solving problems, not on what cool models they’re using to solve the problems.

What are the most interesting trends in data science and cybersecurity right now?

As far as what I find interesting and what keeps me up at night: It’s usage of AI for both.

It’s the whole thing about, “When you have a hammer, everything is a nail.” You need to take it back to first principles and figure out what problems you are actually trying to solve to effectively utilize these amazing technologies.

For example, ThreatConnect has an internal query language for our platform. We built a generator for that. You can say, “I want to make a query to get this kind of data,” and it will build that query for you to use in the platform, because who wants to become an expert on ThreatConnect query language, right? That is direct usage of AI that helps customers.

But sometimes I worry about people being a little bit too loosey-goosey with it, and not taking the appropriate precautions with guardrails to make sure the model is answering the question it should be.

Like I said before, we’re conservative in what we show customers. I don’t want an LLM to make up some information and then present it to the customer. I’m on the hook for that kind of thing. I worry that people are not being careful enough to avoid these kinds of problems.

In addition to working at ThreatConnect, you’re an adjunct assistant professor of statistics at the University of Missouri. How does coursework compare to work-work?

I’ve designed five courses for [the department], all of them at the graduate level, and I teach them online at night.

There’s a very beautiful circle that I have with my day job and my night job, so to speak, in that I can build practical skills and work on practical problems at work. And what the students really crave is that practical knowledge… A lot of times, professors don’t work on practical engineering problems like I do every day. So I can offer a very unique perspective to the students that they wouldn’t have otherwise.

On the other side of the circle, this also allows me to stay up to date on the most cutting-edge trends on a very deep technical level that I would not be able to do in my day job. I wouldn’t be able to say, I’m going to block my calendar from one to four, and I’m going to read the mathematical details of this very esoteric, deep learning model that I just found. But I do that for a class in the evening.

Do you have any cybersecurity or data science resources that you recommend?

If you listen to the Chip Huyen Pragmatic Engineer episode, that was the best take on modern uses of AI that I’ve ever heard. In fact, she’s the first person I’ve seen who is trying to formalize the definition of AI agents. Right now, everybody’s saying agents, agent AI—but there’s no definition of that. She’s trying to formalize it.

About the Author

Sarah Cottone

Sarah is a freelance content strategist, writer, and editor for B2B tech companies. She’s currently based outside of Denver.
