this post was submitted on 10 Dec 2023
Aotearoa / New Zealand
There are a few different reasons that I've thought about for now:
A lot of the data that we are working with is quite large, and it's sometimes a struggle to work with it in Google Sheets / Excel (Unfortunately our workplace uses both for some reason)
I have some weekly reports that I've somehow ended up generating (getting data via SQL, massaging the data, and presenting via a dashboard or sharing a spreadsheet).
For creating a repeatable set of calculations when someone asks for something (which I'm sort of doing via Power Query or Google Apps Script)
I'm quite big on visualizations, so I want to give Matplotlib a go.
And I do a bit of coding (JavaScript & C++ (Arduino)), and have always wanted to add Python to my list of skills, especially recently, as I begin to delve more into data.
Those sound like perfect scenarios! One of the first projects that got me hooked on Python was processing large CSV files instead of opening them in Excel and running Visual Basic on them.
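For anyone curious what that kind of CSV processing can look like, here's a rough sketch using only the standard library. The function name and the column names are made up for illustration, and it assumes the file has a header row and a numeric value column. It streams the file one row at a time, so it never needs the whole thing in memory the way a spreadsheet does.

```python
import csv
from collections import defaultdict


def sum_by_column(csv_path, group_col, value_col):
    """Stream a CSV row by row and total value_col for each group_col value.

    Hypothetical helper for illustration: assumes a header row and that
    value_col holds numbers.
    """
    totals = defaultdict(float)
    with open(csv_path, newline="", encoding="utf-8") as f:
        # DictReader yields one row at a time instead of loading the whole file
        for row in csv.DictReader(f):
            totals[row[group_col]] += float(row[value_col])
    return dict(totals)
```

Something like `sum_by_column("sales.csv", "region", "amount")` would then give you a dict of totals per region, where `sales.csv` and its columns are placeholders for your own data.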
If you haven't already, you should check out DuckDB for working with your larger data sets, too. It's pretty neat. https://duckdb.org/
I've had a brief look into DuckDB, and I'm not too sure if I'm interpreting its use case correctly, but does it basically allow you to use SQL within your Python to query large datasets that you have locally?
That's right. You can read in structured files and query them locally without having to load them into a database first. It's nice in the case where you would rather write analytics SQL, or want to convert between SQL and pandas. It's very quick to load and query files. It can connect to databases, too.
Oooh, that sounds pretty promising - I've been struggling with how to handle quite large datasets when they don't live within a database.
Thank you for enlightening me! :) - I might have to send you some messages or the like later if I have any questions if that's okay with you?
Sure thing!
Thank you :)