44 GenAI in Banking & Finance : Unsupervised Learning in FinTech

Unsupervised Learning in FinTech

Mathematical Foundations with Conceptual Interpretation


1. Introduction

In previous discussions on supervised learning, we considered problems where a target variable Y is known. The objective was to learn a mapping:

f: X \rightarrow Y

where X represents input variables and Y represents known outcomes such as loan default, fraud occurrence, or asset returns.

However, many real-world financial datasets do not come with labeled outcomes. A bank may possess millions of customer records containing income, spending behavior, and transaction history—but no explicit label indicating customer category. Similarly, a trading firm may observe stock returns but may not have predefined “market regime” labels.

In such cases, prediction is not the immediate objective. Instead, the goal is structure discovery. This is the domain of Unsupervised Learning.




2. Definition of Unsupervised Learning

Mathematical Representation

Given a dataset:

D = \{x_1, x_2, \dots, x_n\}

where each observation x_i \in \mathbb{R}^p, unsupervised learning seeks to uncover hidden structure within the feature space.

Unlike supervised learning, there is no target variable Y.

There is no loss function comparing predicted and actual outputs. Instead, the structure must be inferred directly from the geometry or distribution of the data.


Conceptual Explanation

In simple terms, unsupervised learning answers the question:

“How are the data points related to each other?”

It does not predict outcomes.
It organizes and reveals patterns.

In FinTech applications, this is particularly useful when:

  • Customer segments are unknown.

  • Fraud patterns are emerging.

  • Market regimes are shifting.

Unsupervised learning becomes a tool for financial discovery rather than financial prediction.


3. Clustering: The Core Unsupervised Technique

Clustering is the most widely used unsupervised method in financial analytics.

Mathematical Objective

Given data:

X = \{x_1, x_2, \dots, x_n\}

We partition it into K clusters:

C_1, C_2, \dots, C_K

such that:

  1. C_i \cap C_j = \emptyset \quad (i \neq j)

  2. \bigcup_{k=1}^{K} C_k = X


Interpretation

This formal definition ensures:

  • Each data point belongs to exactly one cluster.

  • All data points are assigned.

  • Clusters do not overlap.

The objective is to group similar financial entities together while separating dissimilar ones.

For example:

  • Customers with similar spending patterns form a segment.

  • Stocks with correlated returns form a risk cluster.


4. Measuring Similarity

Clustering depends critically on how we define similarity.


4.1 Euclidean Distance

d(x_i, x_j) = \sqrt{\sum_{k=1}^{p} (x_{ik} - x_{jk})^2}

This measures the straight-line distance between two observations in p-dimensional space.

Explanation

Euclidean distance assumes:

  • Features are numeric

  • Variables are properly scaled

In financial datasets, this could measure similarity between two customers based on income, spending, and transaction frequency.

If two customers have similar feature values, their Euclidean distance will be small, and clustering algorithms will likely group them together.
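The scaling caveat matters in practice: income (tens of thousands) can completely dominate transaction frequency (tens). A minimal sketch, using three hypothetical customers, of how standardizing features can flip which pair looks "closest":

```python
import numpy as np

# Unscaled features: [annual income, monthly transactions] for three hypothetical customers
a = np.array([50_000.0, 10.0])
b = np.array([50_500.0, 90.0])   # income similar to a, very different activity
c = np.array([49_000.0, 12.0])   # different income, activity similar to a

# Raw Euclidean distances: the income axis dominates, so a looks closer to b
raw_ab = np.linalg.norm(a - b)
raw_ac = np.linalg.norm(a - c)

# Standardize each feature to zero mean and unit variance, then recompute:
# now behavioral similarity (activity) is weighted fairly, and a pairs with c
X = np.vstack([a, b, c])
Z = (X - X.mean(axis=0)) / X.std(axis=0)
scaled_ab = np.linalg.norm(Z[0] - Z[1])
scaled_ac = np.linalg.norm(Z[0] - Z[2])
```

This is why standardization is a routine preprocessing step before any distance-based clustering of financial features.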


4.2 Manhattan Distance

d(x_i, x_j) = \sum_{k=1}^{p} |x_{ik} - x_{jk}|

Explanation

Manhattan distance measures movement along axes rather than straight-line distance. It is often more robust to extreme values and may perform better in certain financial risk datasets where outliers exist.


4.3 Cosine Similarity

\text{Cosine Similarity} = \frac{x_i \cdot x_j}{\|x_i\| \, \|x_j\|}

Explanation

Cosine similarity measures similarity in direction rather than magnitude.

In finance, this is especially useful when analyzing:

  • Stock return series

  • Portfolio movement patterns

Two stocks may have different return magnitudes but move in the same direction. Cosine similarity captures this directional similarity.
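A quick sketch of that contrast, using made-up return series where one stock is an amplified copy of the other (e.g., a high-beta name): cosine similarity sees them as identical in direction, while Euclidean distance sees them as far apart.

```python
import numpy as np

# Hypothetical daily returns: stock_b moves in the same direction as stock_a,
# but with three times the magnitude
stock_a = np.array([0.01, -0.02, 0.015, -0.005])
stock_b = 3 * stock_a

# Cosine similarity of parallel vectors is exactly 1 (same direction)
cosine = np.dot(stock_a, stock_b) / (np.linalg.norm(stock_a) * np.linalg.norm(stock_b))

# Euclidean distance is nonzero because the magnitudes differ
euclidean = np.linalg.norm(stock_a - stock_b)
```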


5. K-Means Clustering

K-Means is the most widely used clustering algorithm in financial applications.


5.1 Objective Function

J = \sum_{k=1}^{K} \sum_{x_i \in C_k} \|x_i - \mu_k\|^2

where:

  • \mu_k is the centroid of cluster k


Interpretation

This function measures the total squared distance of each point from its cluster center.

The algorithm attempts to minimize this value.

Minimizing J ensures:

  • Points within a cluster are close to the centroid.

  • Clusters are compact.

In financial segmentation, this means customers in the same group have similar characteristics.


5.2 Centroid Update Rule

\mu_k = \frac{1}{|C_k|} \sum_{x_i \in C_k} x_i

Explanation

The centroid is simply the mean of all points in a cluster.

Each iteration:

  • Assign points to nearest centroid.

  • Recompute centroid.

  • Repeat until convergence.

Convergence occurs when cluster assignments stabilize.
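The assign/update loop above can be written directly in NumPy. This is a minimal sketch on synthetic two-group data; the farthest-point initialization used here is just one simple heuristic (real implementations typically use k-means++), and the data and K=2 are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic 2-D data: two well-separated customer-like groups
X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(5, 0.5, (50, 2))])
K = 2

# Simple initialization: the first point, plus the point farthest from it
c0 = X[0]
c1 = X[np.argmax(np.linalg.norm(X - c0, axis=1))]
centroids = np.stack([c0, c1])

for _ in range(100):
    # Assignment step: each point joins the cluster of its nearest centroid
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    # Update step: each centroid moves to the mean of its assigned points
    new_centroids = np.array([X[labels == k].mean(axis=0) for k in range(K)])
    if np.allclose(new_centroids, centroids):  # convergence: centroids stable
        break
    centroids = new_centroids
```

After convergence, the two centroids sit near the centers of the two groups, and `labels` gives each point's segment.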


5.3 Financial Application

Suppose a bank wants to divide customers into 4 groups based on:

  • Income

  • Spending

  • Credit utilization

  • Transaction frequency

K-Means may produce clusters such as:

  • Premium customers

  • Conservative savers

  • Credit-dependent customers

  • Emerging affluent group

This segmentation supports targeted marketing strategies and personalized financial offerings.


6. Limitations of K-Means

K-Means assumes:

  • Clusters are spherical

  • Similar cluster sizes

  • Minimal noise

Financial data often violates these assumptions.

For example:

  • Fraud transactions are rare and irregular.

  • Risk clusters may have complex shapes.

In such cases, alternative clustering techniques are needed.


7. Hierarchical Clustering

Hierarchical clustering builds a nested tree-like structure called a dendrogram.


7.1 Distance Between Clusters

Single linkage:

d(A, B) = \min_{x \in A,\, y \in B} d(x, y)

Complete linkage:

d(A, B) = \max_{x \in A,\, y \in B} d(x, y)

Average linkage:

d(A, B) = \frac{1}{|A|\,|B|} \sum_{x \in A} \sum_{y \in B} d(x, y)

Explanation

These formulas define how we measure distance between groups rather than individual points.

In portfolio management:

  • Hierarchical clustering reveals how stocks group together.

  • It identifies sectors and sub-sectors automatically.

This helps construct diversified portfolios by avoiding concentration risk.
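As an illustrative sketch (toy return vectors, not real market data), single-linkage agglomeration can be hand-rolled in a few lines: start with each stock in its own cluster and repeatedly merge the closest pair until the desired number of groups remains. Production code would typically use a library routine such as SciPy's hierarchical clustering instead.

```python
import numpy as np

# Toy daily returns for five hypothetical stocks: two tech-like, three bank-like
returns = np.array([
    [0.010, 0.012, -0.008, 0.004],   # TECH_A
    [0.011, 0.013, -0.007, 0.005],   # TECH_B
    [-0.002, 0.001, 0.003, -0.001],  # BANK_A
    [-0.001, 0.002, 0.004, -0.002],  # BANK_B
    [-0.003, 0.001, 0.002, -0.001],  # BANK_C
])

# Each stock starts in its own cluster
clusters = [{i} for i in range(len(returns))]

def single_linkage(a, b):
    # Minimum pairwise Euclidean distance between members of clusters a and b
    return min(np.linalg.norm(returns[i] - returns[j]) for i in a for j in b)

while len(clusters) > 2:  # stop when two sector-like groups remain
    pairs = [(single_linkage(clusters[i], clusters[j]), i, j)
             for i in range(len(clusters)) for j in range(i + 1, len(clusters))]
    _, i, j = min(pairs)          # merge the closest pair of clusters
    clusters[i] |= clusters.pop(j)
```

The merges recover the two sector-like groups from the return vectors alone, which is exactly the structure a dendrogram would display.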


8. DBSCAN

DBSCAN stands for Density-Based Spatial Clustering of Applications with Noise.


Core Concept

Clusters are dense regions of points separated by sparse regions.

Two parameters:

  • \varepsilon (eps): neighborhood radius

  • MinPts: minimum neighbors

Points are classified as:

  • Core points

  • Border points

  • Noise points


Financial Interpretation

Fraud transactions:

  • Do not follow regular patterns

  • Are isolated

  • Appear as anomalies

DBSCAN naturally identifies:

  • Dense clusters of normal transactions

  • Sparse anomalies (potential fraud)

Unlike K-Means, DBSCAN does not require specifying the number of clusters.
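A minimal sketch of the core/border/noise classification on synthetic transaction-like data (the density test only, not a full DBSCAN with cluster expansion; the eps and MinPts values are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
# Dense cloud of "normal" transactions plus three isolated anomalies
normal = rng.normal(0, 0.3, (60, 2))
anomalies = np.array([[4.0, 4.0], [-3.5, 3.0], [5.0, -4.0]])
X = np.vstack([normal, anomalies])

eps, min_pts = 1.0, 5

# Pairwise distances; count neighbors within eps (excluding the point itself)
dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
neighbor_counts = (dists <= eps).sum(axis=1) - 1

is_core = neighbor_counts >= min_pts                          # dense neighborhoods
reachable = ((dists <= eps) & is_core[None, :]).any(axis=1)   # within eps of a core point
is_border = ~is_core & reachable
is_noise = ~is_core & ~reachable                              # candidate anomalies
```

The three isolated points fall out as noise with no label information at all, which is the behavior that makes density-based methods attractive for fraud screening.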


9. Why Unsupervised Learning is Critical in FinTech

Unsupervised learning enables:

  • Discovery of customer segments without labels

  • Detection of emerging fraud schemes

  • Identification of hidden portfolio structures

  • Behavioral clustering for personalization

In many financial systems, labels emerge only after damage occurs. Unsupervised learning allows early detection before formal classification is possible.


10. Conclusion

Supervised learning predicts outcomes:

f(X) \rightarrow Y

Unsupervised learning reveals structure within:

X

Clustering methods such as K-Means, Hierarchical Clustering, and DBSCAN provide mathematical tools for discovering patterns in complex financial data.

In the FinTech ecosystem, where data grows exponentially and patterns evolve dynamically, unsupervised learning plays a foundational role in enabling strategic intelligence, risk discovery, and behavioral insight.

✍️ Author’s Note

This blog reflects the author’s personal point of view — shaped by 25+ years of industry experience, along with a deep passion for continuous learning and teaching.
The content has been phrased and structured using Generative AI tools, with the intent to make it engaging, accessible, and insightful for a broader audience.
