25 Million Datasets and Not One About Google’s Data Centres
Google’s new platform for datasets lacks data on data centres
Let me say right away that I am happy Google has built this tool for searching datasets, and I enjoy using the service. As they wrote on their blog on the 23rd of January:
“Across the web, there are millions of datasets about nearly any subject that interests you. If you’re looking to buy a puppy, you could find datasets compiling complaints of puppy buyers or studies on puppy cognition.”
However, I was hoping that the first thing I would find would be easily accessible data from Google’s data centres. That was not the case at the time.
It is now April 2020. Is it not slightly embarrassing that I can find information about complaints from puppy buyers, yet no datasets with emission numbers from Google?
When I search for Google and sustainability there is nothing directly related either.
You would hope that the team working on sustainability at Google is aware of this and is taking action. Perhaps it is on its way?
It could very well be the case.
Then again, a search for Amazon.com and sustainability does not turn up much either.
Data centres for Amazon then? Nope.
Well surely, Apple would have… Nope.
At least Microsoft, right? Surely they must have their numbers up on this new search platform?
Not at all.
The lack of data on data centres
I previously wrote an article on the lack of data on data centres:
The Lack of Data on Data Centres
Why is it so hard to find reliable data on data storage and emissions?
It is frightfully ironic and almost (if not certainly) comical.
So much data out there and so little data about how it is managed.
It is a pervasive problem and, from what I can see, an obstacle to making these large technology companies take responsibility for the emissions from their data centre operations.
It is possible to make these datasets available, and it should be done.
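As a rough illustration of what "available" could mean in practice: dataset search engines of this kind generally discover datasets through schema.org `Dataset` markup embedded in the publishing web page. The sketch below builds such a description in Python. The function name `dataset_jsonld`, the dataset name, the description and the `example.com` URL are all hypothetical placeholders, not a real dataset from any company.

```python
import json


def dataset_jsonld(name, description, url, keywords):
    """Build a minimal schema.org Dataset description as a plain dict."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "keywords": keywords,
    }


# Hypothetical example values; no such dataset or URL is known to exist.
example = dataset_jsonld(
    name="Example data centre electricity use and emissions",
    description=(
        "Hypothetical annual electricity use and greenhouse gas "
        "emissions per data centre."
    ),
    url="https://example.com/datasets/data-centre-emissions",
    keywords=["data centres", "emissions", "electricity"],
)

# The JSON output would be embedded in the page in a
# <script type="application/ld+json"> element.
print(json.dumps(example, indent=2))
```

This only shows the metadata wrapper. The harder part, which the companies themselves would have to do, is publishing the underlying numbers.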
The best outcome would be for this article to help make emissions datasets for data centres, from all companies, readily available on this search engine.
That would be a dream come true!
I know the world is dealing with the Coronavirus, and we are on lockdown, yet the larger issues with the climate crisis will persist unless we do everything we can to bring down emissions.
Working from home and ramping up everything digital will increase the demand for digital infrastructure, and we have to realise that this has a footprint too.
We need more data about data centres.
Yes, these large companies have sustainability departments
As a side note, I do understand that these companies have teams working on sustainability and tech for good. I have covered this previously. I also asked this question on climatechange.ai in a post called Data Centres Emission Numbers of the Largest Tech Companies, and received the following answer, based mostly on energy usage:
- “Google: 10 TWh in 2018 (source: 2019 Environment Report)
- Microsoft: 8 TWh in 2018, see PDF for different GHG estimates (source: 2018 Data Factsheet: Environmental Indicators)
- Facebook: 3.4 TWh in 2018, and around 340 kt CO2 in Scope 1 and 2 emissions (source: 2018 Sustainability Data Disclosure)
- Apple: 2.2 TWh in 2018 and 63 kt CO2e Scope 1 and 2 emissions (source: 2019 Environmental Responsibility Report)
- Netflix: 254 GWh in 2018 (source: Netflix renewable energy update, March 2019)
- Amazon is the largest energy user among FAANG, but I could not find any data.”
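To show how little it takes to turn the quoted figures into something machine-readable, here is a minimal Python sketch that puts them in a CSV table and adds them up. The column names and the helper functions are my own assumptions. Amazon is left out because no figure was given, and the emissions-per-kWh ratio is only a rough indication, since Scope 1 and 2 emissions cover more than data centre electricity.

```python
import csv
import io

# Figures for 2018 exactly as quoted in the answer above.
# (company, electricity in TWh, Scope 1 and 2 emissions in kt, or None)
QUOTED = [
    ("Google", 10.0, None),
    ("Microsoft", 8.0, None),   # GHG estimates are in the cited PDF
    ("Facebook", 3.4, 340.0),   # "around 340 kt CO2"
    ("Apple", 2.2, 63.0),       # kt CO2e
    ("Netflix", 0.254, None),   # quoted as 254 GWh
]


def to_csv(rows):
    """Serialise the quoted rows as CSV text (column names are assumptions)."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["company", "year", "electricity_twh", "scope_1_2_kt"])
    for company, twh, kt in rows:
        writer.writerow([company, 2018, twh, "" if kt is None else kt])
    return buf.getvalue()


def implied_intensity_g_per_kwh(twh, kt):
    """1 kt = 1e9 g and 1 TWh = 1e9 kWh, so kt per TWh equals g per kWh."""
    return kt / twh


total_twh = sum(twh for _, twh, _ in QUOTED)

print(to_csv(QUOTED))
print("Total quoted electricity use (TWh):", round(total_twh, 3))
for company, twh, kt in QUOTED:
    if kt is not None:
        ratio = implied_intensity_g_per_kwh(twh, kt)
        print(company, "rough g CO2 per kWh:", round(ratio, 1))
```

Five rows are hardly a dataset, but they make the point: the numbers exist in reports and PDFs, and publishing them in a structured form is not technically difficult.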
Sustainability in the cloud
AWS is committed to running our business in the most environmentally friendly way possible. As part of Amazon's…
Amazon Sustainability Data Initiative
Providing access to large datasets in the cloud helps researchers and innovators address a wide range of sustainability…
Sustainability is part of everything we do at Google. We are committed to renewable energy, efficient operations, and…
Efficiency - Data Centers - Google
The cloud supports many products at a time, so it can more efficiently distribute resources among many users. That…
I understand that there is information on this topic, yet my argument still stands: these datasets should be made available, at least to some degree and ideally to a larger extent.
This is #500daysofAI and you are reading article 322. I am writing one new article about or related to artificial intelligence every day for 500 days. My focus for days 300–400 is AI, hardware and the climate crisis.