DuckDB is great, and DuckDB-WASM is magic.
I built a whole LLM benchmark system around it that lets you run the whole benchmark in your browser: https://sql-benchmark.nicklothian.com/#sample-queries-and-sq...
Click on a cell and you can run the SQL the LLM generated vs what the solution is: https://sql-benchmark.nicklothian.com/?highlight=ggml-org_ge...
The hardest part is getting people to understand that it is interactive! People expect a document-looking webpage to be static, but we can do so much better!
This is awesome. Is your code open source? It would be cool to make a textbook for SQL in this format.
It seems that DuckDB by default downloads and runs extensions at runtime when you use certain features? This seems unnecessarily risky.
https://duckdb.org/docs/current/extensions/overview#autoload...
I would love to have more detail on this mechanism.
I believe, as it states, that’s only for the core extensions listed here: https://duckdb.org/docs/current/core_extensions/overview
All are by the DuckDB team except three with third-party owners. I’m unfamiliar with Vortex, but presume it’s like LanceDB and MotherDuck, with a serious company behind it, and presumably the DuckDB team trusts them not to ship malware in their extension.
I think it’s a UX trade-off that benefits users with minimal security downsides, and you can configure this behavior. Some docs here: https://duckdb.org/docs/current/operations_manual/securing_d...
Thanks for the link. Good to know that they are at least signed by a key. But I really like my software not changing on me at all. I'd rather have all of the modules I need locally and static.
Also creates fun situations like getting on a plane then realizing that your extension isn't available!
It seems that nixpkgs at least fails to run the extension but more by luck than design. I hope they find a way to vendor the extensions locally.
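For anyone who wants the "local and static" behaviour discussed above, here is a minimal sketch of turning off on-demand extension downloads. The three settings (`autoinstall_known_extensions`, `autoload_known_extensions`, `extension_directory`) are documented DuckDB options; the helper name `pin_extensions` and the directory path are made up for illustration, and the function takes any object with an `execute` method, so treat this as a starting point rather than a complete hardening recipe.

```python
from typing import Protocol


class SupportsExecute(Protocol):
    """Anything with an execute(sql) method, e.g. a DuckDB connection."""

    def execute(self, query: str) -> object: ...


def pin_extensions(con: SupportsExecute, extension_dir: str) -> list[str]:
    """Disable implicit extension install/load and use a local extension directory.

    `extension_dir` is a hypothetical path that you control and can vendor.
    Returns the statements that were executed, for logging or auditing.
    """
    safe_dir = extension_dir.replace("'", "''")  # escape quotes for the SQL literal
    statements = [
        # Do not download an extension implicitly when a query needs one.
        "SET autoinstall_known_extensions = false",
        # Do not load an extension implicitly; require an explicit LOAD.
        "SET autoload_known_extensions = false",
        # Store and look up installed extensions in a directory you manage.
        f"SET extension_directory = '{safe_dir}'",
    ]
    for sql in statements:
        con.execute(sql)
    return statements
```

With the `duckdb` Python package, this would be called as `pin_extensions(duckdb.connect(), "/opt/duckdb_ext")`. Running `INSTALL httpfs` once while online, then only `LOAD httpfs` afterwards, should keep things working on a plane, assuming the extension binary matches the installed DuckDB version and platform.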
I'm relatively new to DuckDB (coming from SQLite) and I love it so far. Some parts are magical (described in the previous article by the same author: https://peterdohertys.website/blog-posts/dab-of-duck.html)
You can point DuckDB to almost any data source and boom, you get an SQL table that you can search, sum, or join to any other data. Or you can attach existing databases from completely independent db systems, and query and join them as one, without first having to import anything.
It feels exhilarating (if you're into that sort of thing!)
We wrapped exactly this into a GUI: attach MySQL and PostgreSQL, files/S3 as sources, query them together with DuckDB. No imports. https://streams.dbconvert.com/cross-database-sql
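To make the "point at a file, attach a database, join them" workflow above concrete, here is a small sketch using the `duckdb` Python package. The file names (`orders.csv`, `crm.duckdb`), table and column names are all invented for the example. It attaches a second DuckDB file rather than SQLite, PostgreSQL or MySQL, because those go through their own extensions and this version is meant to run offline.

```python
import csv
from pathlib import Path


def join_csv_with_attached_db(workdir: Path) -> list[tuple]:
    """Join a CSV file against a table in a separately attached database file."""
    import duckdb  # third-party (pip install duckdb); imported here on purpose

    # A CSV file standing in for "almost any data source".
    csv_path = workdir / "orders.csv"
    with csv_path.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerows([["customer_id", "amount"], [1, 10.0], [1, 5.0], [2, 7.5]])

    # A separate database file standing in for an existing, independent database.
    other_path = workdir / "crm.duckdb"
    other = duckdb.connect(str(other_path))
    other.execute("CREATE TABLE customers (id INTEGER, name VARCHAR)")
    other.execute("INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace')")
    other.close()

    # Query both together from a fresh in-memory session; nothing is imported.
    con = duckdb.connect()
    con.execute(f"ATTACH '{other_path}' AS crm (READ_ONLY)")
    rows = con.execute(
        f"""
        SELECT c.name, sum(o.amount) AS total
        FROM read_csv('{csv_path}') AS o
        JOIN crm.customers AS c ON c.id = o.customer_id
        GROUP BY c.name
        ORDER BY c.name
        """
    ).fetchall()
    con.close()
    return rows
```

The CSV is read in place by `read_csv` and the second database is only attached, so the join runs without copying either source into the session first.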
My honeymoon with duckdb wore off pretty quickly when I needed to compile it myself into a single-file concordance. I understand it's open source, so I'm free to be ignored. But it's positioning itself as a drop-in replacement for SQLite; a large part of SQLite's appeal is its ergonomics, its single-fileness, letting me deliver a rational object to my users.
EDIT: "drop-in replacement like SQLite", not "for SQLite".
> it's positioning itself as a drop-in replacement for SQLite
While SQLite is often used for comparison (“SQLite for OLAP”), I’ve never seen DuckDB market itself as a “drop-in” replacement. Where did you see that?
SQLite and DuckDB serve pretty different niches; DuckDB is less embeddable, but on the OLAP side it’s by far the best today. I wouldn’t ever see them as competing for the same app, though.
Has anyone used DuckDB (or anything else) to create an open source way to publish a mailbox so that a regular person can browse it and search it?
I'm aware of jmail.world, but they haven't (yet?) published the source code.
I had Claude hack something together recently: https://healdsburg-youcubed-emails.vercel.app/
It works fine for this small set of emails, although the search isn't great, and there was more preprocessing than I would have liked. (I would prefer to be able to point a single binary at a pst or mbox file, and have it magically serve it like this, even if it means I need a VPS to serve it.)
Simon Willison's wonderful Datasette plus the mbox-to-sqlite extension is exactly what you want.
https://datasette.io/
https://github.com/simonw/mbox-to-sqlite
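For a feel of what the mbox-to-SQLite approach involves, here is a rough standard-library sketch. It is not how mbox-to-sqlite itself works (I haven't checked its schema); the table layout and the function names `index_mbox` and `search` are my own, and it assumes your Python's SQLite build includes FTS5, which most do. It also only indexes plain-text parts, so HTML-only mail and attachments are ignored.

```python
import mailbox
import sqlite3


def plain_text_body(msg) -> str:
    """Concatenate the text/plain parts of a message, decoding leniently."""
    chunks = []
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        try:
            chunks.append(payload.decode(charset, errors="replace"))
        except LookupError:  # unknown charset name in the message
            chunks.append(payload.decode("utf-8", errors="replace"))
    return "\n".join(chunks)


def index_mbox(mbox_path: str, db_path: str = ":memory:") -> sqlite3.Connection:
    """Load headers and plain-text bodies from an mbox file into an FTS5 table."""
    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS messages "
        "USING fts5(sender, recipient, subject, date UNINDEXED, body)"
    )
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for msg in box:
            con.execute(
                "INSERT INTO messages VALUES (?, ?, ?, ?, ?)",
                (
                    str(msg["From"] or ""),
                    str(msg["To"] or ""),
                    str(msg["Subject"] or ""),
                    str(msg["Date"] or ""),
                    plain_text_body(msg),
                ),
            )
    finally:
        box.close()
    con.commit()
    return con


def search(con: sqlite3.Connection, query: str) -> list[tuple[str, str]]:
    """Return (sender, subject) for messages matching an FTS5 query, best first."""
    cur = con.execute(
        "SELECT sender, subject FROM messages WHERE messages MATCH ? ORDER BY rank",
        (query,),
    )
    return cur.fetchall()
```

Pass a file path as `db_path` and the resulting database can be opened by Datasette for browsing. Note that `search` passes the query straight to FTS5, so punctuation in user input can raise a syntax error unless you quote it first.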
Maybe https://github.com/wesm/msgvault will do what you need?
Thanks.
msgvault seems really good. The TUI is fast and FTS5 search works well.
I will definitely use it.
But it doesn't allow me to make a mailbox accessible to a wide audience, because:
- AFAICT there's no web version
- inline images don't show up
What’s your use case for this?
Here's one: a client of mine has a bunch of SnapLogic pipelines that are configured to send errors via email, and there is no other persistent logging system. This results in tens of thousands of emails that are insanely hard to search and parse for any useful auditing.
Making it easy for members of the public to search and browse email sent by or to government employees.
These emails aren't published by default but email archives are often included in responses to public record requests.
Ideally anyone who receives one of these archives would be able to easily inspect it themselves, and also make it available to others.
Anyone tried both DuckDB and clickhouse-local?
ClickHouse seems less marketed, but seems quite similar.
I didn’t realize they had chDB, honestly, need to give that a shot! The local CLI isn’t quite the same ergonomically