Biographies of the deceased, built from collected data

In this article, we will project ourselves into the future to present a possible application of artificial intelligence and Big Data. The idea is to automatically produce biographies of individuals who have just died, from the large amount of digital data they generated throughout their lives. To that end, we will briefly survey the current state of the technology to ground this idea in the near future.


Artificial intelligence: Artificial intelligence is currently a rapidly expanding field of research, with the private and public sectors racing against each other. This deep paradigm shift is under way in finance (with Fintech, the use of advanced technology for financial purposes), among private actors like the GAFAM (Google, Apple, Facebook, Amazon, Microsoft), in the automotive sector (to develop autonomous vehicles), and also in the cloud, in big data (data science), and in the robotics industry (e.g. Boston Dynamics, Hanson Robotics).

This competition also exists at the geopolitical and military level (see here, 2017), because AI lends itself to a considerable range of possible uses (China's social credit system is worth mentioning here). The technology is developing very quickly through innovative training methods (see this link to learn more, 2019).

Big Data: The Big Data explosion took place around 2010, when companies began massively storing the data generated by users of their online platforms and services. As you probably know, every time you use the internet, your data ends up in data centers, where it can be exploited (the job of data scientists).

The amount of data stored since 2010 has followed an exponential growth curve. The arrival of artificial intelligence has made it possible to speed up the analysis and exploitation of this data.

Produced data:

  • Global data volume in 2017: 25 zettabytes according to the IDC (International Data Corporation).

  • Number of internet users: 4.021 billion in January 2018 according to a We Are Social report (53% of the global population).

  • Daily produced data by user in 2012: 5 gigabytes per US user according to an MIT study (Massachusetts Institute of Technology).

  • Daily produced data by user in 2017: If we roughly calculate the average from the numbers above (25 zettabytes divided by 4.021 billion users), the data produced and stored annually comes to an astronomical 6.2 terabytes for a single internet user. That's 17 gigabytes per day per internet user. For comparison, a simple email is about 50 KB and a music track about 4 MB. Obviously, this result does not reflect the large inequalities in internet use, but it is still an interesting indicator. This huge number can be explained by the fact that some people have access to unlimited broadband services, and by the amount of content posted and downloaded on the internet every day (Snapchat, YouTube, Netflix, etc.). There are also applications and services that are constantly connected to the internet, such as surveillance camera networks.
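The rough per-user average above can be reproduced in a few lines. This is only a sketch of the arithmetic, assuming decimal units (1 zettabyte = 10^21 bytes) and a 365-day year; the function and variable names are made up for this illustration.

```python
# Figures quoted in the list above (decimal SI units assumed).
GLOBAL_DATA_2017_BYTES = 25e21   # 25 zettabytes, per the IDC figure
INTERNET_USERS_2018 = 4.021e9    # 4.021 billion, per the We Are Social figure


def per_user_volume(total_bytes, users):
    """Return (bytes per user per year, bytes per user per day)."""
    per_year = total_bytes / users
    per_day = per_year / 365  # assumption: a plain 365-day year
    return per_year, per_day


per_year, per_day = per_user_volume(GLOBAL_DATA_2017_BYTES, INTERNET_USERS_2018)

print(f"{per_year / 1e12:.1f} TB per user per year")  # 6.2 TB
print(f"{per_day / 1e9:.0f} GB per user per day")     # 17 GB
```

Like the estimate itself, this divides a global total by a head count, so it says nothing about how unevenly the data is actually spread across users.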

Idea shot:

Given the IDC's forecasts, we should see this number continue to grow exponentially. Although much of the data collected is technical content (such as connection times, duration of use, navigation, etc.), there is also highly informative content, such as what is posted on social networks.

The main idea is to exploit the digital data produced by an individual throughout her or his life to automatically create a faithful biography post mortem, in order to present it to posterity. This could be made possible by an AI able to exploit this data. To avoid sliding into a Big Brother-style technological dystopia, individual rights relating to the exploitation of personal data should be respected, so it is important to obtain the person's agreement before she or he dies. A document drafted beforehand by the person, either giving this agreement or stating the wish not to benefit from this post mortem service, seems necessary.

To me, the right to benefit from such a service should be universal and free. It could work much like the non-profit organization behind Wikipedia, but chances are it would be privatized (well, this world isn't perfect). Technically speaking, the artificial intelligence in charge of this task should be trained to detect information with real biographical value (among all the data produced by the deceased), then proceed to sort it in chronological order so as to write an accurate biography.

For that, this AI project could be a good start.
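The detect-then-sort step described above can be sketched in miniature. This is purely illustrative: the `Record` type, the keyword list, and the function names are hypothetical, and the keyword heuristic merely stands in for the trained model the article has in mind. The consent check mirrors the agreement document mentioned earlier.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Record:
    when: date
    source: str  # e.g. "social", "professional", "technical"
    text: str


# Hypothetical stand-in for a trained classifier: a naive keyword heuristic.
BIOGRAPHICAL_KEYWORDS = ("graduated", "hired", "married", "born", "moved", "award")


def has_biographical_value(record):
    """Guess whether a record is worth keeping for the biography."""
    if record.source == "technical":  # connection logs, usage times, etc.
        return False
    lowered = record.text.lower()
    return any(keyword in lowered for keyword in BIOGRAPHICAL_KEYWORDS)


def build_timeline(records, consent_given):
    """Keep biographically relevant records and sort them chronologically."""
    if not consent_given:
        raise PermissionError("no prior agreement from the person")
    kept = [r for r in records if has_biographical_value(r)]
    return sorted(kept, key=lambda r: r.when)


def render(timeline):
    """Render the timeline as one dated line per event."""
    return "\n".join(f"{r.when.isoformat()}: {r.text}" for r in timeline)


records = [
    Record(date(2012, 9, 1), "professional", "Hired as a junior engineer"),
    Record(date(2015, 3, 2), "technical", "Session length: 42 minutes"),
    Record(date(2008, 6, 20), "social", "Graduated from university today!"),
    Record(date(2013, 1, 5), "social", "Nice weather this morning"),
]

timeline = build_timeline(records, consent_given=True)
print(render(timeline))
```

With the sample records, only the graduation and the hiring survive the filter, and they come out in date order. A real system would replace the keyword test with a learned model and would need far richer inputs than these one-line records.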

Here is a list of datasets that could have a real biographical value:

  • The data produced on social media, which provide information about our career, both professional and personal, as well as our interests and relationships.

  • Social and public data, which is currently migrating from paper to digital media. It includes information on our health journey.

  • The professional data from the companies we worked for, which reveals information about our professional career.

  • The educational and academic journey, which yields valuable insights into how we approached the professional world and our career changes.

  • Personal successes and failures, which usually show up in the different datasets discussed above.

However, the important question of centralized access to all this data by the AI still remains.

The automated production of biographies of deceased people, with the help of their internet data and AI, would make it possible to create a huge digital sanctuary in honor of people who have left this world.

Open to thoughts:

This idea could contribute greatly to our culture, thanks to an accessible, historical record of the life of each individual who has agreed to have his or her biography written. This project could also help to preserve the memory of whole groups by revealing the relational links (the dynamics) between them.

Indeed, the contribution to the social sciences, such as history, sociology, or even education, would be significant.

License CC BY



©2019 by SparkVortex