SQL Linear Recursion to Find Cheapest Airline Connections
By Ashutosh Malgaonkar
Table of Contents:
- The Problem Statement
- How the Data Looks
- Understanding the Problem with the Data
- How Recursion is Implemented in PostgreSQL
- Solution Implementation
I. The Problem Statement
COMPANY X employees are trying to find the cheapest flights to upcoming conferences.
When people travel long distances, a direct flight between two cities is often more expensive than taking two flights with a layover. Travelers might save even more by breaking the trip into three flights with two stops. However, we assume no traveler will make more than two stops on a journey.
The flight data is stored in a table with the following columns:
- id – The unique ID of the flight
- origin – The origin city of the flight
- destination – The destination city of the flight
- cost – The cost of the flight
The objective is to generate a trips table containing the cheapest possible travel options between all origin-destination pairs.
II. How the Data Looks
The data includes flights from various origins to multiple destinations, often with varying costs. For example:
ID | Origin | Destination | Cost |
---|---|---|---|
1 | New York | Chicago | $100 |
2 | Chicago | Los Angeles | $150 |
3 | New York | Los Angeles | $300 |
4 | Los Angeles | Seattle | $120 |
5 | New York | Seattle | $400 |
Using recursion, we aim to find the cheapest way to travel between these cities.
III. Understanding the Problem with the Data
Single-flight options are straightforward, but multiple layovers introduce complexity. A traveler flying from New York to Los Angeles could take either:
- Direct flight: $300
- Layover flight through Chicago: $100 + $150 = $250
In this case, the second option is the cheapest, and we must identify such routes dynamically using SQL recursion.
IV. How Recursion is Implemented in PostgreSQL
PostgreSQL supports recursion through Common Table Expressions (CTEs). A basic recursion example to print numbers from 1 to 10 is:
WITH RECURSIVE counter AS ( SELECT 1 AS n UNION ALL SELECT n + 1 FROM counter WHERE n < 10 ) SELECT * FROM counter;
The recursive query starts with an initial value (1) and continues incrementing until the condition (n < 10) is met.
V. Solution Implementation
We can use a similar recursive approach to find the cheapest airline connections.
Step 1: Creating a Recursive Query
WITH RECURSIVE cheapest_routes AS ( SELECT id, origin, destination, cost FROM flights UNION ALL SELECT NULL, r.origin, f.destination, r.cost + f.cost FROM cheapest_routes r JOIN flights f ON r.destination = f.origin WHERE r.cost + f.cost < 500 ) SELECT origin, destination, MIN(cost) AS cheapest_cost FROM cheapest_routes GROUP BY origin, destination;
This query:
- Starts with direct flights.
- Joins routes recursively, adding costs.
- Limits total cost to $500 to prevent infinite loops.
- Finds the minimum cost per route.
Step 2: Optimizing the Query
To improve performance, we can add an index on the origin
and destination
columns:
CREATE INDEX idx_flights ON flights (origin, destination);
This speeds up joins and enhances query efficiency.
Step 3: Output the Cheapest Routes
Running the query produces a table of optimized flights:
Origin | Destination | Cheapest Cost |
---|---|---|
New York | Los Angeles | $250 |
New York | Seattle | $370 |
Chicago | Seattle | $270 |
Conclusion
Using SQL recursion in PostgreSQL allows us to efficiently find the cheapest airline connections by dynamically exploring multi-stop routes. This approach helps optimize travel costs and provides valuable insights for organizations searching for budget-friendly travel solutions.