This project focused on analyzing sales data from basic basket products to uncover commercial risks and revenue opportunities. Using a full ETL pipeline — SQL for extraction, Python for transformation, and Power BI for visualization — we built a comprehensive analytical model that connected data to strategy. Through dynamic dashboards and automated insights, the analysis provided a clear picture of declining trends, product concentration, and customer behavior, serving as a data-driven foundation for decision-making.
This stage involved querying the tickets
table to gather strategic insights, moving from a general understanding of total sales to more detailed product and customer behavior analyses.
SELECT * FROM tickets LIMIT 100;
Key metric: Total Revenue Generated
SELECT SUM(precio_total) AS total_revenue FROM tickets;
To assess revenue evolution over time:
SELECT strftime('%Y-%m', fecha) AS month, SUM(precio_total) AS monthly_revenue
FROM tickets
GROUP BY month
ORDER BY month;
Breakdown of revenue by department:
SELECT id_departamento, SUM(precio_total) AS department_sales
FROM tickets
GROUP BY id_departamento
ORDER BY department_sales DESC;
Sales by product section:
SELECT id_seccion, SUM(precio_total) AS section_sales
FROM tickets
GROUP BY id_seccion
ORDER BY section_sales DESC;
Top 10 products by quantity sold:
SELECT nombre_producto, SUM(cantidad) AS total_sold
FROM tickets
GROUP BY nombre_producto
ORDER BY total_sold DESC
LIMIT 10;
Top 10 products by revenue:
SELECT nombre_producto, SUM(precio_total) AS product_revenue
FROM tickets
GROUP BY nombre_producto
ORDER BY product_revenue DESC
LIMIT 10;
Top 20 clients by revenue:
SELECT id_cliente, SUM(precio_total) AS client_revenue
FROM tickets
GROUP BY id_cliente
ORDER BY client_revenue DESC
LIMIT 20;
Average spend per client:
SELECT AVG(client_total_purchase) AS avg_per_client
FROM (
SELECT id_cliente, SUM(precio_total) AS client_total_purchase
FROM tickets
GROUP BY id_cliente
) AS subquery;
Total number of orders:
SELECT COUNT(DISTINCT id_pedido) AS total_orders FROM tickets;
Average order value:
SELECT AVG(order_total) AS avg_order_value
FROM (
SELECT id_pedido, SUM(precio_total) AS order_total
FROM tickets
GROUP BY id_pedido
) AS subquery;
After executing SQL queries, Excel was used as a workspace to record and organize the query results. This allowed for a clear visualization of the data outputs before building the dashboard and writing the report.
Once the SQL queries were validated and the metrics were clearly documented in Excel, the next step was to create an interactive dashboard in Power BI, connecting directly to the SQLite database via an ODBC (Open Database Connectivity) driver.
Following the development of the dashboard, a strategic report was created to communicate insights clearly to the company’s executive board. The report was structured into three key parts:
The business generated a total of €39,854,875.32 in revenue over the year.
However, the analysis revealed a significant downward trend in monthly revenue, highlighting urgent challenges in the company’s current sales strategy.
Key Insight:
Revenue declined by 82% from January to December, signaling a critical need for diversification and strategic reorientation.
Main Risks Identified:
In this stage, I implemented a custom Market Basket Analysis pipeline in Python focused on product co-occurrence, with a strong emphasis on direct database integration, data transformation, and the development of logic-based metrics (confidence and lift) — all tailored to real transactional data.
A key part of this process was establishing a live connection to the company’s transactional database using the sqlite3
library. This allowed direct querying and retrieval of the full tickets
table — containing information on orders, products, departments, and sections — without relying on intermediate file exports (e.g., CSVs or Excel).
Once the data was in Python, I reshaped the dataset to simulate individual shopping carts. This was done by grouping products by order ID
and creating a structured format that lists all products purchased together.
To analyze product relationships, the grouped data was converted into a binary matrix using one-hot encoding (get_dummies
). In this matrix:
Rather than using prebuilt association libraries, I developed custom functions to calculate two core metrics of market basket analysis:
To make the insights business-relevant, the resulting associations were merged with product metadata (product ID, section, and department). The final output was exported as a CSV file for further use in reports or dashboards.
This Python-based solution allowed the company to:
After the Market Basket Analysis was completed in Python, the enriched dataset — containing association rules with product metadata — was integrated into Power BI for visual exploration.