From Explanation to Prediction: How Data is Reshaping Psychology

For most of its history, psychology has been a discipline of explanation. We observe behavior, we build theories, we try to understand why people think, feel, and act the way they do. And that work matters enormously – it always will. But something is changing. Quietly, and then not so quietly, psychology is becoming a science of prediction.

This shift didn’t happen overnight. It emerged from the collision of two forces: the explosion of digital data about human behavior, and the development of computational methods powerful enough to find patterns in that data that no human researcher could spot on their own. The result is a discipline that is starting to look very different from the one most of us were trained in.

The Moment Things Changed

If I had to point to a single study that marked the turning point, it would be Michal Kosinski’s research on Facebook likes and personality prediction. The finding was striking: a computational model, trained on nothing more than a person’s Facebook likes, could predict their Big Five personality traits more accurately than their friends, family, and even their spouse could.

Think about that for a moment. An algorithm, fed digital footprints – not questionnaire responses, not clinical interviews, not hours of observation – outperformed the people who know you best.

That study didn’t just make headlines. It forced the field to reckon with a fundamental question: what happens when we move from explaining behavior after the fact to predicting it before it happens? And it opened a door that has only grown wider since.

It Wasn’t a One-Off

Kosinski’s study was the moment many people woke up, but it wasn’t an isolated case. The evidence that digital traces can reveal deep psychological truths has only accumulated since then.

One study that I find particularly striking came from researchers analyzing language on Reddit’s relationship subreddits. They found that changes in a user’s language patterns – subtle shifts in word choice, sentence structure, and emotional tone – could predict a breakup roughly a month before it actually happened. The users themselves likely had no conscious awareness that their language was changing, but the data saw it coming.

That’s the power of this new paradigm. It doesn’t just confirm what we already suspect – it reveals patterns that are invisible to the people living them. When language becomes a predictor of relationship dissolution, when social media activity becomes a window into personality structure, when typing patterns can indicate cognitive decline – we’re not in the world of traditional psychology anymore. We’re in a world where behavior leaves digital traces, and those traces contain more information than we ever imagined.

From Why to What’s Next

Traditional psychology asks: why did this person behave this way? Predictive psychology asks: given what we know, what will this person likely do? The difference isn’t just methodological – it’s philosophical. It changes the role of the psychologist from interpreter to forecaster, and it demands a completely different toolkit.

Explanatory research thrives on small, carefully controlled samples and theory-driven hypotheses. A researcher formulates a theory, designs an experiment, collects data from a few hundred participants, and tests whether the results support the hypothesis. This approach has given us everything from attachment theory to cognitive behavioral therapy. It is rigorous, elegant, and deeply important.

Predictive research operates differently. It thrives on large datasets, machine learning, and pattern recognition at scale. The researcher doesn’t always start with a theory. Sometimes the data leads. Sometimes patterns emerge that no existing framework anticipated. This approach is opening doors that were simply closed before – in mental health screening, in education, in organizational behavior, in public policy.

Big data didn’t just give us more data. It gave us a different kind of psychology.

And I want to be clear: these two approaches are not in opposition. The best work in the field will come from researchers who can move between both – who can generate predictions from data and then build explanatory frameworks to understand why those predictions work. But the balance is shifting, and if we’re not prepared for that shift, we’ll be left behind.

The Gap in Georgia

I think about this shift constantly, because in Georgia, we’re still largely in the explanatory era. And I don’t say that as a criticism – it’s a reality shaped by several interconnected challenges.

The first is data availability. Despite the significant amount of research conducted in the country, almost none of the resulting datasets are public. Public research organizations conduct polls and surveys on a wide range of topics, but the data rarely becomes available to other researchers. Open science – the principle that data, methods, and findings should be freely shared – is not yet common practice here.

There are exceptions. The Caucasus Research Resource Centers (CRRC) shares its survey data publicly, and it’s an invaluable resource. And data from international large-scale assessments like PISA, TIMSS, and PIRLS is available because the organizations that conduct them internationally – the OECD and IEA – make data sharing a requirement. I’ve built much of my own research career on these datasets precisely because they’re accessible when so little else is.

But beyond these exceptions, if you’re a researcher in Georgia who wants to do secondary data analysis – who wants to work with existing datasets to ask new questions – you will hit walls quickly. And without accessible data, it’s extremely difficult to train the next generation of researchers to work at scale.

The second challenge is technical skills. Working with large datasets requires proficiency in tools like R and knowledge of advanced statistical methods – Item Response Theory, structural equation modeling, machine learning, text analysis. These aren’t part of most psychology curricula in Georgia yet. Students graduate knowing SPSS and t-tests, but the world has moved far beyond that.

And then there’s mindset. Explanatory research is familiar, well-established, and, frankly, easier. Running a survey, performing a t-test, and interpreting the results is a well-worn path. There’s comfort in it. Predictive modeling requires comfort with ambiguity, iterative thinking, and a willingness to let the data speak before imposing theory on it. That’s a different kind of intellectual posture, and it takes time to develop.

Why I Built This Program

This is exactly why I created the Psychology in the Digital World program at BTU. Not because explanatory psychology is obsolete – it isn’t – but because the gap between where Georgia stands and where the field is heading globally needs to be closed. And the only way to close it is through education.

The program is designed to accelerate this transition. Students learn research methodology alongside R programming. They study psychometrics alongside digital behavior. They engage with PISA, PIRLS, and TIMSS data – real, large-scale datasets – not just textbook examples. The goal is to produce graduates who are comfortable with data, who understand measurement, and who can think in terms of prediction as well as explanation.

It will take time. Shifting a field’s orientation doesn’t happen in a single semester or even a single generation of graduates. But every student who learns to work with data at scale, who embraces open science, who thinks predictively – that’s progress.

Bridging the Gap with Tools

Part of the challenge is that statistical concepts are genuinely hard to teach, especially to students with no prior quantitative background. When I was learning statistics myself, and now when I teach it, one thing is consistently clear: without visualization, many concepts remain abstract and inaccessible.

That’s why I built the Statistical Concepts Explorer (SCE) – an interactive R Shiny application with 65 modules covering everything from basic descriptive statistics to advanced psychometric models like IRT and structural equation modeling. The tool is designed primarily for educators, to make it easier to show students what a normal distribution actually looks like, how sample size affects confidence intervals, and what happens when you violate the assumptions of a test.
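The sample-size effect that SCE visualizes boils down to a single piece of arithmetic: the half-width of a 95% normal-approximation confidence interval for a mean shrinks with the square root of n. A minimal sketch of that relationship (in Python for brevity here – this is illustrative, not SCE code, and the assumed population SD of 15 is an arbitrary, IQ-like choice):

```python
import math

# Half-width of a 95% normal-approximation CI for a mean:
# 1.96 * sigma / sqrt(n). Quadrupling n halves the interval.
sigma = 15.0  # assumed population SD (arbitrary, IQ-like scale)

for n in (25, 100, 400, 1600):
    half_width = 1.96 * sigma / math.sqrt(n)
    print(f"n = {n:4d} -> 95% CI half-width = {half_width:.2f}")
```

Seeing those numbers shrink – 5.88, 2.94, 1.47, 0.74 – makes the diminishing returns of larger samples far more tangible than the formula alone.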

SCE doesn’t solve the skills gap on its own. But it addresses one piece of the puzzle: if we can make statistical concepts more intuitive and visual, we lower the barrier to entry. A student who can see how IRT item characteristic curves change with different discrimination parameters is much closer to understanding the concept than one who only reads about it in a textbook.
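To make the IRT example concrete, the curve in question is the two-parameter logistic item characteristic curve, where discrimination a controls the steepness and difficulty b the location. A minimal sketch (again in Python rather than R, purely for illustration; the ability points and parameter values are arbitrary):

```python
import math

def icc_2pl(theta: float, a: float, b: float) -> float:
    """Two-parameter logistic ICC: probability of a correct
    response at ability theta, with discrimination a and
    difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Two items with the same difficulty (b = 0) but different
# discrimination: the steeper item (a = 2.0) separates low
# and high ability far more sharply than the flat one (a = 0.5).
for a in (0.5, 2.0):
    probs = [round(icc_2pl(t, a, 0.0), 2) for t in (-2, -1, 0, 1, 2)]
    print(f"a = {a}: {probs}")
```

Plotting exactly this kind of comparison, and letting students drag the parameters themselves, is what the SCE modules are built for.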

Tools like this are part of the infrastructure we need. Not just better curricula, but better instruments for learning.

What Students Think

When I present these ideas in the classroom – that an algorithm can predict personality from Facebook likes, that language patterns on Reddit can forecast breakups – the reactions are mixed. Some students are genuinely fascinated. They see the possibilities, they want to know how it works, they start asking questions about what else could be predicted.

Others are less moved. And I understand that. Many students come to psychology because they want to help people. They’re drawn to counseling, to therapy, to understanding human suffering. The idea that they also need to learn R or understand regression models can feel like an unwelcome detour from what they signed up for.

But here’s the thing I always try to convey: there is no helping people without evidence, and there is no evidence without data. You don’t need to become a programmer. But if you can’t read a study critically, if you can’t evaluate whether a therapeutic intervention actually works, if you can’t tell the difference between a well-designed study and a poorly designed one – then you’re not practicing psychology. You’re practicing something else.

Without evidence from data, psychology becomes pseudoscience. And pseudoscience doesn’t help anyone.

The students who grasp this – who see that data skills and human compassion are not in conflict but are deeply complementary – those are the ones who will move the field forward.

What Success Looks Like

If I fast-forward five or ten years and imagine what success looks like for this program’s graduates, I don’t see them all working in traditional psychology roles. I see them crossing into other fields – education policy, UX research, health informatics, data science, organizational development – carrying psychological thinking into domains that desperately need it.

I see them sharing their data openly. I see them running predictive studies alongside explanatory ones. I see them building tools that make psychological measurement more accessible, more rigorous, and more useful. I see them contributing to a culture of open science that doesn’t yet exist in Georgia but could, if enough people commit to it.

Most importantly, I see them using data not as an end in itself, but as a means to understand and improve human lives. Because that’s what this shift is really about. The algorithms, the datasets, the statistical models – these are just instruments. The science – the real science – is still about people.

The question is not whether psychology will become more data-driven. It already is. The question is whether we’ll be ready for it – whether our students, our institutions, and our research culture will adapt fast enough to participate in this transformation rather than simply watch it happen elsewhere.

That’s the question I’m trying to answer, one cohort at a time.


Giorgi Tchumburidze
April 2026
