Brian Achaye

Data Scientist

Data Analyst

ODK/Kobo Toolbox Expert

BI Engineer

Data Solutions Consultant

How I Use ChatGPT to Supercharge My Data Analysis Workflow

As a data analyst, I’m always looking for ways to work smarter—not harder. And while Python, SQL, and Excel are my core tools, I’ve found that ChatGPT is a game-changer for speeding up repetitive tasks, debugging code, and even brainstorming analysis approaches.

But here’s the catch: ChatGPT won’t replace data analysts—it just makes us faster. Used correctly, it can help with data cleaning, SQL query generation, and even explaining complex statistical concepts. Used poorly, it can lead to hallucinated formulas or incorrect logic.

In this post, I’ll share exactly how I integrate ChatGPT into my workflow, along with real examples, limitations, and best practices.

1. Writing and Debugging SQL Queries Faster

Writing complex SQL queries can be time-consuming, especially when dealing with multiple joins or window functions. Instead of scouring Stack Overflow, I now use ChatGPT to:

  • Generate SQL snippets (e.g., “Write a SQL query to find the 30-day rolling average of sales by customer.”)
  • Optimize slow queries (e.g., “Why is this PostgreSQL query slow? [paste query]”)
  • Explain tricky SQL concepts (e.g., “What’s the difference between RANK() and DENSE_RANK()?”)
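
The RANK() versus DENSE_RANK() question is easy to check directly instead of taking an explanation on trust. Below is a minimal sketch using Python's built-in sqlite3 module. The scores table and its rows are made up for illustration, and it assumes a SQLite build with window-function support (3.25 or newer).

```python
import sqlite3

# In-memory SQLite database with a hypothetical scores table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (player TEXT, points INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("ana", 90), ("ben", 90), ("cho", 80), ("dee", 70)],
)

# Rank the same rows with both functions so the difference is visible side by side.
rows = conn.execute(
    """
    SELECT player,
           points,
           RANK()       OVER (ORDER BY points DESC) AS rnk,
           DENSE_RANK() OVER (ORDER BY points DESC) AS dense_rnk
    FROM scores
    ORDER BY points DESC, player
    """
).fetchall()
conn.close()

for row in rows:
    print(row)
```

The two players tied on 90 points share rank 1 under both functions. The next player gets 3 from RANK(), which skips a position after the tie, and 2 from DENSE_RANK(), which does not.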

Example Prompt:

“Write a PostgreSQL query to find customers who made purchases in the last 30 days but not in the previous 30 days.”

ChatGPT generates a clean query with a LEFT JOIN and WHERE clause—saving me 10-15 minutes of trial and error.
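
For reference, here is a sketch of what a LEFT JOIN answer to that prompt can look like, together with the small-dataset test it should go through before being trusted. It is not ChatGPT's actual output. It runs against SQLite through Python's sqlite3 module so it is self-contained; the purchases table, its columns, and the fixed reference date are assumptions for illustration. In PostgreSQL the date arithmetic would typically be written with CURRENT_DATE - INTERVAL '30 days' instead of SQLite's date() function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (customer_id INTEGER, purchased_on TEXT)")
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?)",
    [
        (1, "2024-03-20"),  # recent window only: expected in the result
        (2, "2024-03-25"),  # recent window ...
        (2, "2024-02-20"),  # ... and previous window: expected to be excluded
        (3, "2024-02-15"),  # previous window only: excluded
        (4, "2023-12-01"),  # older than both windows ...
        (4, "2024-03-30"),  # ... plus a recent purchase: expected in the result
    ],
)

# Self-join: keep recent buyers that have no matching row in the previous window.
QUERY = """
SELECT DISTINCT recent.customer_id
FROM purchases AS recent
LEFT JOIN purchases AS prev
       ON prev.customer_id = recent.customer_id
      AND prev.purchased_on >  date(:today, '-60 days')
      AND prev.purchased_on <= date(:today, '-30 days')
WHERE recent.purchased_on >  date(:today, '-30 days')
  AND recent.purchased_on <= :today
  AND prev.customer_id IS NULL
ORDER BY recent.customer_id
"""

# A fixed "today" keeps the check reproducible.
reactivated = [row[0] for row in conn.execute(QUERY, {"today": "2024-03-31"})]
conn.close()
print(reactivated)
```

The hand-labelled rows are the important part: each one encodes a case where the expected answer is known, so a wrong window boundary or join condition shows up immediately.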

⚠️ Watch Out For:

  • ChatGPT sometimes mixes SQL dialects (e.g., MySQL syntax in a PostgreSQL query) or uses outdated syntax.
  • Always test the query on a small dataset first!

2. Automating Data Cleaning with Python

Data cleaning is tedious. Instead of manually writing Pandas code for every transformation, I ask ChatGPT:

  • “How do I remove duplicates but keep the latest record in a Pandas DataFrame?”
  • “Write Python code to impute missing values based on the median of each group.”
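
For the first prompt, a common pandas pattern is to sort by a timestamp and keep the last row per key. A minimal sketch, assuming hypothetical customer_id, updated_at, and plan columns:

```python
import pandas as pd

# Hypothetical records: customers 1 and 3 each appear twice.
df = pd.DataFrame(
    {
        "customer_id": [1, 1, 2, 3, 3],
        "updated_at": pd.to_datetime(
            ["2024-01-05", "2024-02-01", "2024-01-10", "2024-03-01", "2024-01-15"]
        ),
        "plan": ["basic", "pro", "basic", "pro", "basic"],
    }
)

# Sort oldest to newest, then keep the last (most recent) row per customer.
latest = (
    df.sort_values("updated_at")
    .drop_duplicates(subset="customer_id", keep="last")
    .sort_values("customer_id")
    .reset_index(drop=True)
)
print(latest)
```

Sorting first matters: drop_duplicates keeps rows by position, so without the sort "last" would mean last in the file, not latest in time.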

Example Output:

# ChatGPT-generated code to handle missing values  
df['age'] = df.groupby('department')['age'].transform(lambda x: x.fillna(x.median()))  

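This usually gets me about 80% of the way there, but I still tweak it for edge cases. For example, if every age in a department is missing, the group median is NaN and nothing gets filled.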
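
One edge case worth testing with the group-median approach is a group that has no observed values at all. The sketch below shows one possible fallback, the overall median; whether that fallback is appropriate depends on the data, and the sample frame is made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: every age in the "it" department is missing.
df = pd.DataFrame(
    {
        "department": ["ops", "ops", "ops", "hr", "hr", "it", "it"],
        "age": [30.0, np.nan, 40.0, 50.0, np.nan, np.nan, np.nan],
    }
)

# Median of each row's own department, aligned to the original index.
group_median = df.groupby("department")["age"].transform("median")

# Fill from the group median first, then fall back to the overall median
# for groups whose median is itself NaN.
df["age_filled"] = df["age"].fillna(group_median).fillna(df["age"].median())
print(df)
```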

🔗 Pro Tip: Pair ChatGPT with a profiling library such as pandas-profiling (now published as ydata-profiling) to quickly identify cleaning needs.

3. Explaining Statistical Concepts in Plain English

Ever struggled to explain p-values or confidence intervals to a non-technical stakeholder? I use ChatGPT to:

  • Simplify jargon (e.g., “Explain linear regression like I’m 10.”)
  • Generate analogies (e.g., “What’s a real-world example of Bayes’ Theorem?”)

Example:

“Explain multicollinearity in regression without using math terms.”
ChatGPT responds: “Imagine two weather forecasters who always say the same thing. If one says it’ll rain, the other does too. In regression, this means we can’t tell which predictor actually matters.”

📌 Caution: Always fact-check explanations—ChatGPT can oversimplify or be wrong.
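
One way to fact-check an explanation like the multicollinearity analogy is to simulate it. The sketch below builds two predictors that "always say the same thing" plus one unrelated predictor, then computes each one's variance inflation factor (VIF) with NumPy. The data is simulated and the vif helper is written here for illustration, not taken from a library. A common rule of thumb treats a VIF above roughly 5 to 10 as a sign of trouble.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)  # nearly a copy of x1
x3 = rng.normal(size=n)                   # unrelated predictor
X = np.column_stack([x1, x2, x3])


def vif(X, j):
    """Variance inflation factor for column j: 1 / (1 - R^2),
    where R^2 comes from regressing column j on the other columns."""
    y = X[:, j]
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(y)), others])  # add an intercept
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    r2 = 1.0 - resid.var() / y.var()
    return 1.0 / (1.0 - r2)


vifs = [vif(X, j) for j in range(X.shape[1])]
for name, value in zip(["x1", "x2", "x3"], vifs):
    print(f"{name}: VIF = {value:.1f}")
```

The two near-duplicate predictors get very large VIFs while the unrelated one stays close to 1, which matches the forecaster analogy: the model cannot separate the contributions of x1 and x2.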

4. Brainstorming Analysis Approaches

Sometimes, I hit a mental block on how to approach a problem. Instead of staring at a blank Jupyter notebook, I ask:

  • “What are three ways to analyze customer churn data?”
  • “What Python libraries are best for geospatial analysis?”

ChatGPT suggests ideas I might not have considered, like survival analysis for churn or using geopandas for mapping.
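
To make the survival-analysis suggestion concrete, here is a minimal Kaplan-Meier sketch in plain Python. The data is hypothetical, and in practice a dedicated library such as lifelines would normally handle this; the point is only to show what the idea computes. A customer who is still active when observation ends is treated as censored rather than churned.

```python
from collections import Counter


def kaplan_meier(durations, churned):
    """Return [(time, survival_probability)] at each time a churn occurs."""
    events = Counter(t for t, c in zip(durations, churned) if c)
    exits = Counter(durations)  # churned or censored: both leave the risk set
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # Multiply by the share of at-risk customers who did not churn at t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical data: months as a customer, and whether the customer churned.
# False means still active at the end of observation (censored).
durations = [1, 2, 2, 3, 4, 4, 5, 6]
churned = [True, True, False, True, False, True, False, False]

curve = kaplan_meier(durations, churned)
for month, prob in curve:
    print(f"month {month}: {prob:.3f} still subscribed")
```

Compared with a plain churn rate, this keeps the still-active customers in the denominator for as long as they were observed instead of discarding them or counting them as retained forever.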

5. The Risks & How to Use ChatGPT Responsibly

While ChatGPT is powerful, it has serious limitations for data work:

  • It can make up functions and formulas (I once caught it inventing a scikit-learn function that does not exist).
  • It can’t verify data accuracy (e.g., suggesting incorrect aggregations).
  • It lacks domain context (e.g., not knowing your business rules).
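
A quick guard against made-up functions is to check that a suggested name actually exists before building on it. The helper below is a hypothetical sketch, not part of any library. It uses the standard-library statistics module as the example, and weighted_mode is a deliberately invented name.

```python
import importlib


def symbol_exists(module_name, attr_name):
    """Return True if `module_name` can be imported and exposes `attr_name`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, attr_name)


print(symbol_exists("statistics", "median"))         # a real function
print(symbol_exists("statistics", "weighted_mode"))  # invented name for illustration
```

Existence is only the first check. A real function can still be called with the wrong arguments, so the library's documentation remains the reference.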

My Golden Rules:

  • Always validate outputs (run small-scale tests first).
  • Use it for drafts, not final answers (like a coding assistant).
  • Never input sensitive data (privacy risks!).
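
For the last rule, one option is to strip or mask identifying fields before pasting any sample rows into a prompt. The sketch below replaces assumed sensitive columns with short salted hashes. The column names, salt, and redact helper are all hypothetical. Hashing like this is pseudonymization, not full anonymization, so it does not replace a proper privacy review.

```python
import hashlib

SENSITIVE = {"email", "phone"}  # hypothetical column names


def redact(rows, sensitive=SENSITIVE, salt="change-me"):
    """Replace sensitive values with short salted hashes; keep other fields."""
    cleaned = []
    for row in rows:
        safe = {}
        for key, value in row.items():
            if key in sensitive:
                digest = hashlib.sha256(f"{salt}:{value}".encode()).hexdigest()
                safe[key] = digest[:10]  # short token, stable for equal inputs
            else:
                safe[key] = value
        cleaned.append(safe)
    return cleaned


# Made-up sample rows.
sample = [
    {"email": "ana@example.com", "phone": "555-0100", "plan": "pro"},
    {"email": "ana@example.com", "phone": "555-0199", "plan": "basic"},
]

shareable = redact(sample)
for row in shareable:
    print(row)
```

Because equal inputs map to the same token, duplicates and joins still behave the same in the redacted sample, which keeps it useful for debugging questions.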

Final Thoughts: ChatGPT as a Sidekick, Not a Replacement

I don’t use ChatGPT to do my analysis—I use it to accelerate the boring parts. It’s like having a junior analyst who’s great at Googling but sometimes gets things wrong.

Have you tried ChatGPT for data work? Let me know your favorite use cases!
