csvbase is a simple website for sharing table data. Join the discord.

From Shell to Excel - with a little bit of HTTPS

train station data in MS Excel

Write once, read everywhere

2024-08-12

RPC fans hate him! Save your data with this one weird URL!

csvbase is an open source database, on the web. It is made of up tables. The tables have urls, of the form:

https://csvbase.com/<username>/<table_name>

You interact with the tables using HTTP verbs:

  • GET, er, gets the table
  • PUT inserts a csv file as a table at the given url (overwriting if it exists)
    • you append ?public=yes when creating a public table
  • DELETE deletes a table
  • POST appends to a table

Hopefully none of that is too mindblowing. It's just "REST". I always wished there was a simple database like this, so I made it.

What about API keys? Well, if you're just using GET, you don't need one. If you're using another verb, use HTTP Basic Auth, the standard HTTP password field.

csvbase could probably win a prize for the most boring API of all time.

Be boring, and Free

But being boring makes csvbase easy to use in scripts. Here's how to yeet a csv file into csvbase with curl:

curl -n -X PUT http://csvbase.com/calpaterson/my_table?public=yes \
    --data-binary @my_table.csv

Just to break that down:

  • -n reads your username and password from your ~/.netrc file
  • -X PUT makes curl use the HTTP PUT verb
  • --data-binary @my_table.csv says which file to upload

That is it. Once you run that, your table is created on csvbase. Each table gets its own little webpage, and you can browse around; link to specific rows; download it in other formats; etc etc. That webpage looks like this:

the trainstations page on csvbase

And if you change my_table.csv, you just run that curl command again to re-upload the new version.

Perhaps you don't want to entrust your data to csvbase.com. Well, go with god, I suppose: you can download the source code and run it yourself (some assembly required). Elites don't want you to know this, but the bytes on the internet are free, and you can take them home with you.

Don't want to adopt a new data analysis tool? Please: don't.

Sometimes people ask whether csvbase is going to implement more advanced data analysis features. You know, like grouping, drawing graphs, fancy AI stuff, etc etc. Probably not.

The world really does not need any new tools to analyse data. There are already so many. Instead of building a complicated walled garden of (presumably venture-funded?) data-analysis features csvbase just takes the small steps to ensure that it is compatible with all the stuff that already exists.

For example, if you like DuckDB, you probably like the Parquet file format. csvbase will happily serve tables to DuckDB in Parquet format: just add .parquet to the end of the table's url. You can even load them from from inside a DuckDB query:

-- load GB train stations
SELECT *
FROM read_parquet("https://csvbase.com/rmirror/trainstations.parquet");

Maybe you like MS Excel. You guessed it, just add .xlsx to the end of the table's url:

https://csvbase.com/rmirror/trainstations.xlsx

You can even bring a csvbase table into an existing Excel spreadsheet as a data source:

  1. Go to Data > From Web
  2. Paste in the table url, with .xlsx added on the end and click "Ok".
  3. Click "Load data"

And bang, you have this:

the trainstations dataset in MS Excel

Excel weenies will notice that this is using a bit of Excel called "Power Query". Power Query is actually quite a sophisticated bit of kit. For example, it knows how to periodically check for updates, allowing you to create spreadsheets that always stay up to date (for more details, see the FAQ entry on using Excel).

It's easy to be snarky about Excel. It is used in incredibly inappropriate ways. I can reveal to you that some investment bank front offices use MS Excel to execute derivative trades. Big ones. Down-at-heel Formula 1 teams use it to organise their chassis builds. And silly usage of Excel as a database is widespread.

Excel is a bad database. But it was never really meant to be a database. It was meant to be a data analysis tool. And it's pretty good at that. Here is a heatmap of the train stations I was able to make with just what is built into Excel:

the trainstations dataset in MS
Excel

It takes exactly 1 minute to create that heatmap (video below). You can see some interesting patterns - for example I didn't know there was a railway running all the way along the south coast of England.

Would it make sense for csvbase to rebuild that functionality? I doubt it.


But if you think csvbase is cool:


Notes

Here's the video of me creating the heatmap for those of you following along at home:

On some installations of Excel the "3D maps" plugin won't be enabled by default. I don't know what causes this but you might need to:

  1. Enable the COM add-in for it
  2. Customise the Ribbon to put the "3D maps" button somewhere

Why does csvbase implement XLSX format when Excel can already read CSV format? Because Excel is very bad indeed at reading CSVs and will screw up all the character encoding and types.

Sadly, there is no real equivalent of Power Query in LibreOffice.

Comments

No comments yet - add the first one
Markdown is supported