Wow, so many datasets, and almost no datasets about data centres. Illustration by UnDraw.

25 Million Datasets and Not One About Google’s Data Centres

Google’s new platform for datasets lacks data on data centres

It must be said right away that I am so happy Google has made this framework to search for datasets. It truly makes me very glad and I enjoy using this service. They said on their blog on the 23rd of January:

“Across the web, there are millions of datasets about nearly any subject that interests you. If you’re looking to buy a puppy, you could find datasets compiling complaints of puppy buyers or studies on puppy cognition.”

However, I was hoping the first thing I would find would be easily accessible data from Google’s data centres. That was not the case then.

Still, it is now April 2020. Is it not slightly embarrassing that I can find information about complaints of puppy buyers, yet no datasets with emission numbers from Google?

When I search for Google and sustainability there is nothing directly related either.

In a way you would hope the section that worked with sustainability in Google was aware of this and took action — perhaps it is on its way?

It could very well be the case.

Then again and sustainability, not much either.

Data centres for Amazon then? Nope.

Well surely, Apple would have… Nope.

At least Microsoft, right, surely they must have their numbers up on this new platform for search?

Not at all.

The lack of data on data centres

I wrote an article previously on the lack of data on data centres.

It is frightfully ironic and almost (if not certainly) comical.

So much data out there and so little data about how it is managed.

It is a pervasive problem and, from what I can see, an obstacle to making these larger technology companies take responsibility for the emissions from their data centre operations.

It is possible and should be possible to make these datasets available.

The best outcome would be for this article to help make datasets on data centre emissions from all companies readily available on this search engine.

That would be a dream come true!

I know the world is dealing with the Coronavirus, and we are on lockdown, yet the larger issues with the climate crisis will persist unless we do everything we can to bring down emissions.

Working from home and ramping up everything digital will lead to an increase in digital infrastructure, and we have to realise that this has a footprint too.

We need more data about data centres.

Yes, these large companies have sustainability departments

As a side note, I do understand that these companies have sections working with sustainability and tech for good. I have covered this previously. I also asked this question in a post called Data Centres Emission Numbers of the Largest Tech Companies, and received the following answer, mostly based on usage:

I understand that there is information on this topic, yet my argument still stands: these datasets should be made available, at least to some degree and ideally to a larger extent.

This is #500daysofAI and you are reading article 322. I am writing one new article about or related to artificial intelligence every day for 500 days. My focus for days 300–400 is AI, hardware and the climate crisis.


