Exporting and importing CSV data in RavenDB

In today’s episode of not-entirely-obvious RavenDB functionality, I was trying to use the CSV import/export functionality in order to transfer a collection from our live to our test DB, since at the time of writing the “export database” functionality seems to be broken if you try to use it to export only specified collections.

Exporting is obvious enough – simply navigate to a collection you’re interested in exporting, optionally select the documents you want to export by ticking the checkboxes, and then click the export CSV button as shown below.

export_button

This will download a file called export.csv. You can then go to Tasks -> CSV Import to import the data into another database. However, if you just import the CSV as it came out of RavenDB, you’ll find yourself creating a new collection called “export”, with all your documents from the CSV imported with auto-generated GUIDs for the document IDs.

If, instead, you’d like to import the documents into a collection with the same name and IDs as the original, as I did, you’ll need to do the following:

  1. Rename the CSV file to match the desired collection name, for example Cars.csv would import into a collection called Cars

  2. Column names in the CSV with an @ sign before them will be ignored on import. If you want to preserve IDs, open the CSV and change the @id header to id, and the IDs will be included when importing.

Please note however that RavenDB will not update the HILOs if you import data from a CSV, so you’ll need to make sure that you update those yourself separately to avoid any problems when you try to create new documents in the same collection.

Working with time intervals in PostgreSQL and Excel

By default, PostgreSQL (PSQL to its friends) formats time intervals in a way that makes it much more human-readable, which is nice until you have intervals longer than a day and you want to manipulate the data further in something like Microsoft Excel.

To make the time intervals machine-readable again, you can use something like the following (which I found here):

SELECT date_part('epoch', time_interval_field) * INTERVAL '1 second';

This extracts the number of seconds from your interval, and then multiplies it by an interval in order to convert it back to the INTERVAL type. Seems silly, but this forces the formatting into hh:mm:ss, so if you have more than 24 hours, rather than seeing 1 day 12:00:35.12312, you’ll get 36:00:35.12312.

Then all that remains is to use a custom format in Excel when loading your data:

[h]:mm:ss

The square brackets tell Excel to allow values of more than 24 hours (rather than calculating hours % 24 as it would without), and the rest is fairly self-explanatory. If you care about milliseconds, you can include them with [h]:mm:ss.00000.

How to get user IDs in Slack

For those who haven’t yet used it, Slack is a powerful communication platform, based around a chatroom-style way of interacting. I won’t go into details here, except to say that it also has an API that provides various methods to interact with the system, many of which require you to supply a user ID to act on, which is not visible in any of the admin UI sections.

Therefore the easiest way to get IDs for your users is to call the users.list method, which you can run from their test harness at https://api.slack.com/methods/users.list/test. This will give you a full list of users along with their IDs ready for you to use.