Wow, so many datasets, and almost no datasets about data centres. Illustration by UnDraw.

25 Million Datasets and Not One About Google’s Data Centres

Google’s new platform for datasets lacks data on data centres

I should say right away that I am very happy Google has built this tool for searching datasets. It makes me glad, and I enjoy using the service. They wrote about it on their blog on the 23rd of January.

However, I was hoping that the first thing I would find would be easily accessible data from Google’s data centres. That was not the case then.

Still, it is now April 2020. Is it not slightly embarrassing that I can find information about complaints from puppy buyers, yet no datasets with emission numbers from Google?

When I search for Google and sustainability, there is nothing directly related either.

You would hope that the team working on sustainability at Google is aware of this and is taking action. Perhaps it is on its way?

It could very well be the case.

Then again and sustainability, not much either.

Data centres for Amazon then? Nope.

Well surely, Apple would have… Nope.

At least Microsoft, right? Surely they must have their numbers up on this new search platform?

Not at all.

The lack of data on data centres

I wrote an article previously on the lack of data on data centres.

It is frightfully ironic and almost (if not certainly) comical.

So much data out there and so little data about how it is managed.

It is a pervasive problem and, from what I can see, an obstacle to making these larger technology companies take responsibility for the emissions from their data centre operations.

It should be possible to make these datasets available.

The best outcome would be for this article to help make datasets on data centre emissions from all companies readily available on this search engine.

That would be a dream come true!

I know the world is dealing with the coronavirus and we are in lockdown, yet the larger issues of the climate crisis will persist unless we do everything we can to bring down emissions.

Working from home and ramping up everything digital will lead to more digital infrastructure, and we have to realise that it has a footprint too.

We need more data about data centres.

Yes, these large companies have sustainability departments

As a side note, I do understand that these companies have teams working on sustainability and tech for good. I have covered this previously. I also asked this question in a post called Data Centres Emission Numbers of the Largest Tech Companies, and received an answer mostly based on usage.

I understand that there is information on this topic, yet my argument stands: these datasets should be made available, at least to some degree and preferably to a larger extent.

This is #500daysofAI and you are reading article 322. I am writing one new article about or related to artificial intelligence every day for 500 days. My focus for days 300–400 is AI, hardware and the climate crisis.

Written by

AI Policy and Ethics at Student at University of Copenhagen MSc in Social Data Science. All views are my own.
